[jira] [Commented] (HDFS-9956) LDAP PERFORMANCE ISSUE AND FAIL OVER

2016-03-14 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194756#comment-15194756
 ] 

Wei-Chiu Chuang commented on HDFS-9956:
---

It sounds really bad that LDAP group mapping could fail over a NameNode. We 
should investigate why this is happening.

I've been working on a new LDAP group mapping implementation, but I haven't 
finished it yet. I'll prioritize that too.

> LDAP PERFORMANCE ISSUE AND FAIL OVER
> 
>
> Key: HDFS-9956
> URL: https://issues.apache.org/jira/browse/HDFS-9956
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: sanjay kenganahalli vamanna
>
> The typical LDAP group name resolution works well under typical scenarios. 
> However, we have seen cases where a user is mapped to many groups (in an 
> extreme case, a user is mapped to more than 100 groups). The way it is 
> implemented now makes resolving groups from ActiveDirectory extremely slow 
> in this case, causing the NameNode to fail over.
> Instead of failing over, we can use the 
> parameter (ha.zookeeper.session-timeout.ms) in the getgroups method to 
> time out and send a failure response back to the user, so that we can 
> prevent the NameNode failover. 
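The timeout idea in the description can be sketched as follows: bound each group lookup with a {{Future}} so a slow LDAP server fails that one request instead of stalling NameNode handler threads. This is a minimal illustration under stated assumptions, not the actual Hadoop group-mapping code; the class and method names are hypothetical, and the callable stands in for a real LDAP query.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch only: bound an expensive group lookup with a timeout so a slow
// LDAP server fails the single request rather than stalling the caller.
public class TimedGroupLookup {

    public static List<String> getGroups(String user, long timeoutMs,
                                         Callable<List<String>> lookup) {
        // A per-call executor keeps the sketch self-contained; a real
        // implementation would reuse a shared pool.
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<List<String>> f = pool.submit(lookup);
            try {
                return f.get(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                f.cancel(true);  // abandon only this request
                throw new RuntimeException("group lookup timed out for user " + user, e);
            } catch (InterruptedException | ExecutionException e) {
                throw new RuntimeException(e);
            }
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // The lambda stands in for a real LDAP query.
        System.out.println(getGroups("alice", 1000, () -> List.of("staff")));
    }
}
```

The key design point is that a timeout fails one RPC with an error response instead of letting a slow directory server back up every handler thread.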



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9928) Make HDFS commands guide up to date

2016-03-14 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-9928:
--
Attachment: HDFS-9928-trunk.003.patch

Thanks [~iwasakims] again for the review. Here's the patch for trunk.

I've manually verified all command line options against this patch. Hopefully 
I'm not missing anything. The only thing added beyond [~iwasakims]'s 
suggestions is the {{hdfs envvars}} command, which is in trunk only.

> Make HDFS commands guide up to date
> ---
>
> Key: HDFS-9928
> URL: https://issues.apache.org/jira/browse/HDFS-9928
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.9.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>  Labels: documentation, supportability
> Attachments: HDFS-9928-branch-2.002.patch, HDFS-9928-trunk.003.patch, 
> HDFS-9928.001.patch
>
>
> A few HDFS subcommands and options are missing from the documentation.
> # envvars: display computed Hadoop environment variables
> I also noticed (in HDFS-9927) that a few OIV options are missing, and I'll be 
> looking for other missing options as well.
> Filing this JIRA to fix them all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9951) Use string constants for XML tags in OfflineImageReconstructor

2016-03-14 Thread Lin Yiqun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun updated HDFS-9951:

Attachment: HDFS-9551.002.patch

Thanks [~cmccabe] for the comments. You're right that {{PBImageXmlWriter.java}} 
should use the same constants. Updated the patch to address your comments and 
fix the checkstyle issues.

> Use string constants for XML tags in OfflineImageReconstructor
> --
>
> Key: HDFS-9951
> URL: https://issues.apache.org/jira/browse/HDFS-9951
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
>Priority: Minor
> Attachments: HDFS-9551.001.patch, HDFS-9551.002.patch
>
>
> In class {{OfflineImageReconstructor}}, many {{SectionProcessors}} are used to 
> process XML files and load subtrees of the XML into a Node structure. But in 
> many places a node key is removed by writing the string literal directly in a 
> method, rather than defining it as a constant first. Like this:
> {code}
> Node expiration = directive.removeChild("expiration");
> {code}
> We could improve this by defining the tag names as constants in Node and 
> invoking them like this:
> {code}
> Node expiration=directive.removeChild(Node.CACHE_MANAGER_SECTION_EXPIRATION);
> {code}
> This will make it easier to manage the node key names in the future.
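A minimal sketch of the constant-based approach described above, with a stand-in {{Node}} that holds child tags in a map. The constant names mirror the example in the description; the rest of the identifiers are hypothetical, not the actual {{OfflineImageReconstructor}} code.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration: centralize XML tag names as constants so every
// SectionProcessor refers to one definition instead of repeating literals.
// A typo in a constant name fails at compile time, unlike "expiraton".
public class NodeTags {
    public static final String CACHE_MANAGER_SECTION_EXPIRATION = "expiration";
    public static final String INODE_SECTION_REPLICATION = "replication";

    // Minimal stand-in for the reconstructor's Node: a map of child tags.
    public static class Node {
        private final Map<String, String> children = new HashMap<>();
        public void addChild(String tag, String value) { children.put(tag, value); }
        public String removeChild(String tag) { return children.remove(tag); }
    }

    public static void main(String[] args) {
        Node directive = new Node();
        directive.addChild(CACHE_MANAGER_SECTION_EXPIRATION, "30000");
        String expiration = directive.removeChild(CACHE_MANAGER_SECTION_EXPIRATION);
        System.out.println(expiration);
    }
}
```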



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9961) Ozone: Add buckets commands to CLI

2016-03-14 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-9961:
---
Status: Patch Available  (was: Open)

> Ozone: Add buckets commands to CLI
> --
>
> Key: HDFS-9961
> URL: https://issues.apache.org/jira/browse/HDFS-9961
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-7240
>
> Attachments: HDFS-9961-HDFS-7240.001.patch
>
>
> Add command for buckets to ozone CLI



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9961) Ozone: Add buckets commands to CLI

2016-03-14 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-9961:
---
Attachment: HDFS-9961-HDFS-7240.001.patch

> Ozone: Add buckets commands to CLI
> --
>
> Key: HDFS-9961
> URL: https://issues.apache.org/jira/browse/HDFS-9961
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-7240
>
> Attachments: HDFS-9961-HDFS-7240.001.patch
>
>
> Add command for buckets to ozone CLI



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9961) Ozone: Add buckets commands to CLI

2016-03-14 Thread Anu Engineer (JIRA)
Anu Engineer created HDFS-9961:
--

 Summary: Ozone: Add buckets commands to CLI
 Key: HDFS-9961
 URL: https://issues.apache.org/jira/browse/HDFS-9961
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ozone
Affects Versions: HDFS-7240
Reporter: Anu Engineer
Assignee: Anu Engineer
 Fix For: HDFS-7240


Add command for buckets to ozone CLI



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9960) OzoneHandler : Add localstorage support for keys

2016-03-14 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-9960:
---
Attachment: HDFS-9960-HDFS-7240.001.patch

> OzoneHandler : Add localstorage support for keys
> 
>
> Key: HDFS-9960
> URL: https://issues.apache.org/jira/browse/HDFS-9960
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-7240
>
> Attachments: HDFS-9960-HDFS-7240.001.patch
>
>
> Adds local storage handler support for keys. This allows all REST api's to be 
> exercised via MiniDFScluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9960) OzoneHandler : Add localstorage support for keys

2016-03-14 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-9960:
---
Status: Patch Available  (was: Open)

> OzoneHandler : Add localstorage support for keys
> 
>
> Key: HDFS-9960
> URL: https://issues.apache.org/jira/browse/HDFS-9960
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-7240
>
> Attachments: HDFS-9960-HDFS-7240.001.patch
>
>
> Adds local storage handler support for keys. This allows all REST api's to be 
> exercised via MiniDFScluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9918) Erasure Coding: Sort located striped blocks based on decommissioned states

2016-03-14 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-9918:
---
Attachment: (was: HDFS-9918-003.patch)

> Erasure Coding: Sort located striped blocks based on decommissioned states
> --
>
> Key: HDFS-9918
> URL: https://issues.apache.org/jira/browse/HDFS-9918
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-9918-001.patch, HDFS-9918-002.patch, 
> HDFS-9918-003.patch
>
>
> This jira is a follow-on work of HDFS-8786, where we do decommissioning of 
> datanodes having striped blocks.
> Now, after decommissioning, the ordering of the storage list needs to change 
> so that the decommissioned datanodes come last in the list.
> For example, assume we have a block group with storage list:-
> d0, d1, d2, d3, d4, d5, d6, d7, d8, d9
> mapping to indices
> 0, 1, 2, 3, 4, 5, 6, 7, 8, 2
> Here the internal block b2 is duplicated, located in d2 and d9. If d2 is a 
> decommissioning node, then d2 and d9 should be switched in the storage list.
> Thanks [~jingzhao] for the 
> [discussions|https://issues.apache.org/jira/browse/HDFS-8786?focusedCommentId=15180415&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15180415]
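One way to sketch the reordering above is a stable partition that moves decommissioning storages to the tail while keeping live nodes in their original order. This is an illustration only: the actual patch may instead swap the duplicated entries (d2 and d9) to preserve the index mapping, and the {{Set}} standing in for datanode admin state is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;

// Sketch: stable-partition a block group's storage list so that nodes
// flagged as decommissioning/decommissioned come last, preserving the
// relative order of the live nodes.
public class SortDecommissioned {

    public static List<String> sortLiveFirst(List<String> storages,
                                             Set<String> decommissioning) {
        List<String> live = new ArrayList<>();
        List<String> decom = new ArrayList<>();
        for (String s : storages) {
            (decommissioning.contains(s) ? decom : live).add(s);
        }
        live.addAll(decom);  // decommissioning nodes end up at the tail
        return live;
    }

    public static void main(String[] args) {
        List<String> storages = Arrays.asList(
            "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9");
        // d2 is decommissioning; after reordering it comes last, so the
        // duplicate internal block b2 is served from d9 first.
        System.out.println(sortLiveFirst(storages, Set.of("d2")));
    }
}
```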



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9918) Erasure Coding: Sort located striped blocks based on decommissioned states

2016-03-14 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194639#comment-15194639
 ] 

Rakesh R commented on HDFS-9918:


Attached a patch fixing the findbugs warnings.

> Erasure Coding: Sort located striped blocks based on decommissioned states
> --
>
> Key: HDFS-9918
> URL: https://issues.apache.org/jira/browse/HDFS-9918
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-9918-001.patch, HDFS-9918-002.patch, 
> HDFS-9918-003.patch
>
>
> This jira is a follow-on work of HDFS-8786, where we do decommissioning of 
> datanodes having striped blocks.
> Now, after decommissioning, the ordering of the storage list needs to change 
> so that the decommissioned datanodes come last in the list.
> For example, assume we have a block group with storage list:-
> d0, d1, d2, d3, d4, d5, d6, d7, d8, d9
> mapping to indices
> 0, 1, 2, 3, 4, 5, 6, 7, 8, 2
> Here the internal block b2 is duplicated, located in d2 and d9. If d2 is a 
> decommissioning node, then d2 and d9 should be switched in the storage list.
> Thanks [~jingzhao] for the 
> [discussions|https://issues.apache.org/jira/browse/HDFS-8786?focusedCommentId=15180415&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15180415]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9918) Erasure Coding: Sort located striped blocks based on decommissioned states

2016-03-14 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-9918:
---
Attachment: HDFS-9918-003.patch

> Erasure Coding: Sort located striped blocks based on decommissioned states
> --
>
> Key: HDFS-9918
> URL: https://issues.apache.org/jira/browse/HDFS-9918
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-9918-001.patch, HDFS-9918-002.patch, 
> HDFS-9918-003.patch
>
>
> This jira is a follow-on work of HDFS-8786, where we do decommissioning of 
> datanodes having striped blocks.
> Now, after decommissioning, the ordering of the storage list needs to change 
> so that the decommissioned datanodes come last in the list.
> For example, assume we have a block group with storage list:-
> d0, d1, d2, d3, d4, d5, d6, d7, d8, d9
> mapping to indices
> 0, 1, 2, 3, 4, 5, 6, 7, 8, 2
> Here the internal block b2 is duplicated, located in d2 and d9. If d2 is a 
> decommissioning node, then d2 and d9 should be switched in the storage list.
> Thanks [~jingzhao] for the 
> [discussions|https://issues.apache.org/jira/browse/HDFS-8786?focusedCommentId=15180415&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15180415]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9857) Erasure Coding: Rename replication-based names in BlockManager to more generic [part-1]

2016-03-14 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194566#comment-15194566
 ] 

Rakesh R commented on HDFS-9857:


Thanks [~zhz]. Made the changes via IDE refactoring.

> Erasure Coding: Rename replication-based names in BlockManager to more 
> generic [part-1]
> ---
>
> Key: HDFS-9857
> URL: https://issues.apache.org/jira/browse/HDFS-9857
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-9857-001.patch
>
>
> The idea of this jira is to rename the following entities in BlockManager as,
> - {{UnderReplicatedBlocks}} to {{LowRedundancyBlocks}}
> - {{neededReplications}} to {{neededReconstruction}}
> - {{replicationQueuesInitializer}} to {{reconstructionQueuesInitializer}}
> Thanks [~zhz], [~andrew.wang] for the useful 
> [discussions|https://issues.apache.org/jira/browse/HDFS-7955?focusedCommentId=15149406&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15149406]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7987) Allow files / directories to be moved

2016-03-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194560#comment-15194560
 ] 

Hadoop QA commented on HDFS-7987:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
23s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 0m 54s {color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12793421/HDFS-7987.03.patch |
| JIRA Issue | HDFS-7987 |
| Optional Tests |  asflicense  |
| uname | Linux 5c20f6bc0146 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 19e8f07 |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/14818/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |


This message was automatically generated.



> Allow files / directories to be moved
> -
>
> Key: HDFS-7987
> URL: https://issues.apache.org/jira/browse/HDFS-7987
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Attachments: HDFS-7987.01.patch, HDFS-7987.02.patch, 
> HDFS-7987.03.patch
>
>
> Users should be able to move files / directories using the Namenode UI. 
> WebHDFS supports a rename operation that can be used for this purpose.
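WebHDFS documents rename as an HTTP PUT with {{op=RENAME}} and a {{destination}} parameter, which is what the UI could call. The sketch below only builds the request URI; host, port, and paths are placeholders, and actually issuing the PUT and handling redirects or errors is left out.

```java
import java.net.URI;

// Sketch: construct the WebHDFS rename request the UI could issue.
// WebHDFS form: PUT /webhdfs/v1/<PATH>?op=RENAME&destination=<PATH>
public class WebHdfsRename {

    public static URI renameUri(String host, int port, String src, String dst) {
        // src and dst are absolute HDFS paths starting with '/'.
        return URI.create(String.format(
            "http://%s:%d/webhdfs/v1%s?op=RENAME&destination=%s",
            host, port, src, dst));
    }

    public static void main(String[] args) {
        // Host and paths below are placeholders, not real endpoints.
        System.out.println(renameUri("namenode.example.com", 50070,
            "/user/alice/old.txt", "/user/alice/new.txt"));
    }
}
```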



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9953) Download File from UI broken after pagination

2016-03-14 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194522#comment-15194522
 ] 

Brahma Reddy Battula commented on HDFS-9953:


Thanks [~raviprak] for the commit, and thanks [~cnauroth] for taking care of 
this issue.

> Download File from UI broken after pagination
> -
>
> Key: HDFS-9953
> URL: https://issues.apache.org/jira/browse/HDFS-9953
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Blocker
> Fix For: 2.8.0
>
> Attachments: HDFS-9953.patch
>
>
>  File links do not work from the second page onwards; this was introduced in 
> HDFS-9084.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9953) Download File from UI broken after pagination

2016-03-14 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-9953:

Fix Version/s: 2.8.0

> Download File from UI broken after pagination
> -
>
> Key: HDFS-9953
> URL: https://issues.apache.org/jira/browse/HDFS-9953
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Blocker
> Fix For: 2.8.0
>
> Attachments: HDFS-9953.patch
>
>
>  File links do not work from the second page onwards; this was introduced in 
> HDFS-9084.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9917) IBR accumulate more objects when SNN was down for sometime.

2016-03-14 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194518#comment-15194518
 ] 

Brahma Reddy Battula commented on HDFS-9917:


Before a full block report, all pending IBRs are flushed.
In the current problem case, the IBR itself was larger than the FBR, and the 
IBR failed because the NN was not able to process it completely. That's why it 
kept accumulating.


> IBR accumulate more objects when SNN was down for sometime.
> ---
>
> Key: HDFS-9917
> URL: https://issues.apache.org/jira/browse/HDFS-9917
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>
> The SNN was down for some time for various reasons. After restarting, the SNN 
> became unresponsive because:
> - 29 DNs were each sending IBRs of 5 million entries (most of them delete 
> IBRs), whereas each datanode had only ~2.5 million blocks.
> - GC could not reclaim these objects since they were all held in the RPC 
> queue. 
> To recover (to clear these objects), all the DNs were restarted one by 
> one. This issue happened in 2.4.1, where splitting of the block report was 
> not available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9847) HDFS configuration without time unit name should accept friendly time units

2016-03-14 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194511#comment-15194511
 ] 

Arpit Agarwal commented on HDFS-9847:
-

[~linyiqun], do you want to update the patch to address Chris's feedback?
bq. Adding new methods to parameterize getTimeDuration is unnecessary. The 
caller should cast the result if it's losing precision.

We can punt the {{w}}/{{y}} decision to a separate Jira. Adding support for the 
remaining suffixes will be a step forward.

> HDFS configuration without time unit name should accept friendly time units
> ---
>
> Key: HDFS-9847
> URL: https://issues.apache.org/jira/browse/HDFS-9847
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 2.7.1
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
> Attachments: HDFS-9847.001.patch, HDFS-9847.002.patch, 
> timeduration-w-y.patch
>
>
> HDFS-9821 discusses letting existing keys use friendly units, e.g. 60s, 5m, 
> 1d, 6w, etc. But some configuration key names contain a time unit name, like 
> {{dfs.blockreport.intervalMsec}}, so we can make the other configurations, 
> whose names carry no time unit, accept friendly time units. The time unit 
> {{seconds}} is frequently used in HDFS, so we can update those configurations 
> first.
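A sketch of the suffix parsing being discussed, in the spirit of Hadoop's {{Configuration.getTimeDuration}}: values like {{60s}}, {{5m}}, {{1d}} are scaled by their suffix, and a bare number falls back to a caller-supplied default unit. Per the comment above, {{w}}/{{y}} are deliberately left out here. This is an illustration, not the actual Hadoop parser.

```java
import java.util.concurrent.TimeUnit;

// Sketch: parse a config value with an optional time-unit suffix into
// milliseconds. Suffix checks go longest-first so "ms" is not mistaken
// for a bare "s" suffix ending in 'm'.
public class FriendlyTime {

    public static long toMillis(String value, TimeUnit defaultUnit) {
        String v = value.trim().toLowerCase();
        TimeUnit unit = defaultUnit;  // bare numbers use the caller's unit
        if (v.endsWith("ms"))     { unit = TimeUnit.MILLISECONDS; v = v.substring(0, v.length() - 2); }
        else if (v.endsWith("s")) { unit = TimeUnit.SECONDS;      v = v.substring(0, v.length() - 1); }
        else if (v.endsWith("m")) { unit = TimeUnit.MINUTES;      v = v.substring(0, v.length() - 1); }
        else if (v.endsWith("h")) { unit = TimeUnit.HOURS;        v = v.substring(0, v.length() - 1); }
        else if (v.endsWith("d")) { unit = TimeUnit.DAYS;         v = v.substring(0, v.length() - 1); }
        return unit.toMillis(Long.parseLong(v));
    }

    public static void main(String[] args) {
        System.out.println(toMillis("5m", TimeUnit.SECONDS));   // 300000
        System.out.println(toMillis("60", TimeUnit.SECONDS));   // 60000
    }
}
```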



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9959) add log when block removed from last live datanode

2016-03-14 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194501#comment-15194501
 ] 

Arpit Agarwal commented on HDFS-9959:
-

bq. In real production cluster, this should not happen often, otherwise there 
will be lots of corrupt files.
I second [~szetszwo]'s suggestion of moving this statement out of the lock if 
possible. Also let's log this at INFO level.

> add log when block removed from last live datanode
> --
>
> Key: HDFS-9959
> URL: https://issues.apache.org/jira/browse/HDFS-9959
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: yunjiong zhao
>Assignee: yunjiong zhao
>Priority: Minor
> Attachments: HDFS-9959.patch
>
>
> Adding logs like "BLOCK* No live nodes contain block blk_1073741825_1001, last 
> datanode contain it is node: 127.0.0.1:65341" in BlockStateChange should help 
> identify which datanode should be fixed first to recover missing blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-9959) add log when block removed from last live datanode

2016-03-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194500#comment-15194500
 ] 

Tsz Wo Nicholas Sze edited comment on HDFS-9959 at 3/15/16 1:07 AM:


> ... In real production cluster, this should not happen often, otherwise there 
> will be lots of corrupt files.

This is based on the assumptions that the cluster is well managed and that the 
users are not creating single-replica files.  These assumptions may not be 
true, unfortunately.


was (Author: szetszwo):
> ... In real production cluster, this should not happen often, otherwise there 
> will be lots of corrupt files.

This is based on assumptions that the cluster is well managed and the users are 
not creating single replica files.  The assumptions may not be true.

> add log when block removed from last live datanode
> --
>
> Key: HDFS-9959
> URL: https://issues.apache.org/jira/browse/HDFS-9959
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: yunjiong zhao
>Assignee: yunjiong zhao
>Priority: Minor
> Attachments: HDFS-9959.patch
>
>
> Adding logs like "BLOCK* No live nodes contain block blk_1073741825_1001, last 
> datanode contain it is node: 127.0.0.1:65341" in BlockStateChange should help 
> identify which datanode should be fixed first to recover missing blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9959) add log when block removed from last live datanode

2016-03-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194500#comment-15194500
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9959:
---

> ... In real production cluster, this should not happen often, otherwise there 
> will be lots of corrupt files.

This is based on the assumptions that the cluster is well managed and that the 
users are not creating single-replica files.  These assumptions may not be true.

> add log when block removed from last live datanode
> --
>
> Key: HDFS-9959
> URL: https://issues.apache.org/jira/browse/HDFS-9959
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: yunjiong zhao
>Assignee: yunjiong zhao
>Priority: Minor
> Attachments: HDFS-9959.patch
>
>
> Adding logs like "BLOCK* No live nodes contain block blk_1073741825_1001, last 
> datanode contain it is node: 127.0.0.1:65341" in BlockStateChange should help 
> identify which datanode should be fixed first to recover missing blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9349) Support reconfiguring fs.protected.directories without NN restart

2016-03-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194497#comment-15194497
 ] 

Hadoop QA commented on HDFS-9349:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s {color} 
| {color:red} HDFS-9349 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12788281/HDFS-9349-HDFS-9000.004.patch
 |
| JIRA Issue | HDFS-9349 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/14816/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |


This message was automatically generated.



> Support reconfiguring fs.protected.directories without NN restart
> -
>
> Key: HDFS-9349
> URL: https://issues.apache.org/jira/browse/HDFS-9349
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9349-HDFS-9000.003.patch, 
> HDFS-9349-HDFS-9000.004.patch, HDFS-9349.001.patch, HDFS-9349.002.patch
>
>
> This is to reconfigure
> {code}
> fs.protected.directories
> {code}
> without restarting NN.
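One common way to support reconfiguring a property like {{fs.protected.directories}} without a restart is sketched below: readers take a lock-free snapshot from an {{AtomicReference}}, and a reconfigure hook swaps in the freshly parsed value atomically. Class and method names are hypothetical; this is not the actual HDFS reconfiguration framework.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Sketch: hot-reload a comma-separated directory list. Readers never see a
// half-updated value because the whole list is replaced in one atomic set.
public class ReconfigurableDirs {
    private final AtomicReference<List<String>> protectedDirs =
        new AtomicReference<>(List.of());

    // Called by the reconfiguration hook (e.g. triggered by an admin command).
    public void reconfigure(String newValue) {
        protectedDirs.set(List.copyOf(Arrays.asList(newValue.split(","))));
    }

    // Hot-path check: lock-free snapshot read.
    public boolean isProtected(String dir) {
        return protectedDirs.get().contains(dir);
    }

    public static void main(String[] args) {
        ReconfigurableDirs r = new ReconfigurableDirs();
        r.reconfigure("/user,/tmp");
        System.out.println(r.isProtected("/user"));
    }
}
```

The design choice here is immutability plus atomic swap: since the list is replaced rather than mutated, no reader-side locking is needed.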



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9945) Datanode command for evicting writers

2016-03-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194491#comment-15194491
 ] 

Hadoop QA commented on HDFS-9945:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
31s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 48s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 34s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
33s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 38s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
48s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 43s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 34s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 50s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 50s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 50s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 36s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 36s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 36s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 35s 
{color} | {color:red} hadoop-hdfs-project: patch generated 2 new + 538 
unchanged - 4 fixed = 540 total (was 542) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 28s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
28s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 21s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 3s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_74. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 85m 26s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_74. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 4s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 79m 27s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | 

[jira] [Commented] (HDFS-9959) add log when block removed from last live datanode

2016-03-14 Thread yunjiong zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194482#comment-15194482
 ] 

yunjiong zhao commented on HDFS-9959:
-

It shouldn't. It will only print blocks that were removed from the last live 
datanode and that belong to a file. In a real production cluster this should 
not happen often; otherwise there will be lots of corrupt files.


> add log when block removed from last live datanode
> --
>
> Key: HDFS-9959
> URL: https://issues.apache.org/jira/browse/HDFS-9959
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: yunjiong zhao
>Assignee: yunjiong zhao
>Priority: Minor
> Attachments: HDFS-9959.patch
>
>
> Add logs like "BLOCK* No live nodes contain block blk_1073741825_1001, last 
> datanode contain it is node: 127.0.0.1:65341" in BlockStateChange should help 
> to identify which datanode should be fixed first to recover missing blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9959) add log when block removed from last live datanode

2016-03-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194472#comment-15194472
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9959:
---

Then there could be a lot of blocks in this case, since a datanode may contain 
many blocks that fall into this category.  It is a problem, since the NN cannot do 
anything but print the log message.  We need to print it outside the lock.

> add log when block removed from last live datanode
> --
>
> Key: HDFS-9959
> URL: https://issues.apache.org/jira/browse/HDFS-9959
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: yunjiong zhao
>Assignee: yunjiong zhao
>Priority: Minor
> Attachments: HDFS-9959.patch
>
>
> Add logs like "BLOCK* No live nodes contain block blk_1073741825_1001, last 
> datanode contain it is node: 127.0.0.1:65341" in BlockStateChange should help 
> to identify which datanode should be fixed first to recover missing blocks.





[jira] [Created] (HDFS-9960) OzoneHandler : Add localstorage support for keys

2016-03-14 Thread Anu Engineer (JIRA)
Anu Engineer created HDFS-9960:
--

 Summary: OzoneHandler : Add localstorage support for keys
 Key: HDFS-9960
 URL: https://issues.apache.org/jira/browse/HDFS-9960
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ozone
Affects Versions: HDFS-7240
Reporter: Anu Engineer
Assignee: Anu Engineer
 Fix For: HDFS-7240


Adds local storage handler support for keys. This allows all REST APIs to be 
exercised via MiniDFSCluster.





[jira] [Commented] (HDFS-9959) add log when block removed from last live datanode

2016-03-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194428#comment-15194428
 ] 

Hadoop QA commented on HDFS-9959:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s {color} 
| {color:red} HDFS-9959 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12793402/HDFS-9959.patch |
| JIRA Issue | HDFS-9959 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/14815/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |


This message was automatically generated.



> add log when block removed from last live datanode
> --
>
> Key: HDFS-9959
> URL: https://issues.apache.org/jira/browse/HDFS-9959
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: yunjiong zhao
>Assignee: yunjiong zhao
>Priority: Minor
> Attachments: HDFS-9959.patch
>
>
> Add logs like "BLOCK* No live nodes contain block blk_1073741825_1001, last 
> datanode contain it is node: 127.0.0.1:65341" in BlockStateChange should help 
> to identify which datanode should be fixed first to recover missing blocks.





[jira] [Commented] (HDFS-8030) HDFS Erasure Coding Phase II -- EC with contiguous layout

2016-03-14 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194425#comment-15194425
 ] 

Uma Maheswara Rao G commented on HDFS-8030:
---

[~zhz] Rakesh and I discussed the document again for some time. Please find 
below the points we would like to discuss/address; your opinion is welcome.
# As this design converts files from the normal file layout into EC mode, block 
groups need to be created later, at conversion time. But block groups are 
normally allocated contiguous block IDs, so how do we obtain contiguous block 
IDs when converting an existing file?
# Does this add memory overhead, since we need to track block groups separately, 
especially if the block IDs are not contiguous as discussed in #1?
# Parity creation reads all 6 data blocks, i.e. 6 * 256 MB, into memory. I think 
we need to consider this point further. We may need to keep contiguous block IDs 
but generate parity based on stripes: Blk_0, Blk_1 ... Blk_5 are the contiguous 
blocks; we read a cell from each block, treat the cells as a stripe, and 
generate the 3 parity cells, continuing until all data in the blocks is 
processed. This needs more thought.
# Do we support a mixed zone, containing both striped files and contiguous EC 
files?
Please, everyone, also review the document and share your feedback.
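The stripe-based idea in point 3 above could be sketched as follows. This is 
purely illustrative: it uses plain XOR single parity as a stand-in for the real 
Reed-Solomon coder, and the class and method names are made up for the sketch, 
not taken from HDFS code.

```java
public class StripeParitySketch {
    /**
     * Compute a single XOR parity block over contiguous data blocks,
     * one cell-stripe at a time: the cell at a given offset in each
     * block (Blk_0 ... Blk_5) forms one stripe, so only a stripe's
     * worth of data needs to be examined at each step, rather than
     * all 6 * 256 MB at once.
     */
    static byte[] xorParity(byte[][] dataBlocks, int cellSize) {
        int blockLen = dataBlocks[0].length;
        byte[] parity = new byte[blockLen];
        for (int off = 0; off < blockLen; off += cellSize) {
            int len = Math.min(cellSize, blockLen - off);
            // One stripe: the cell at this offset in every block.
            for (byte[] block : dataBlocks) {
                for (int i = 0; i < len; i++) {
                    parity[off + i] ^= block[off + i];
                }
            }
        }
        return parity;
    }

    public static void main(String[] args) {
        byte[][] blocks = { {1, 2, 3, 4}, {5, 6, 7, 8} };
        byte[] p = xorParity(blocks, 2);
        // 1^5=4, 2^6=4, 3^7=4, 4^8=12
        System.out.println(java.util.Arrays.toString(p));  // prints [4, 4, 4, 12]
    }
}
```

A real implementation would emit 3 parity cells per stripe via the erasure 
codec, but the memory argument is the same: processing is bounded by stripe 
size, not by total block size.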

> HDFS Erasure Coding Phase II -- EC with contiguous layout
> -
>
> Key: HDFS-8030
> URL: https://issues.apache.org/jira/browse/HDFS-8030
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: erasure-coding
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFSErasureCodingPhaseII-20151204.pdf
>
>
> Data redundancy form -- replication or erasure coding, should be orthogonal 
> to block layout -- contiguous or striped. This JIRA explores the combination 
> of {{Erasure Coding}} + {{Contiguous}} block layout.
> As will be detailed in the design document, key benefits include preserving 
> block locality, and easy conversion between hot and cold modes. 





[jira] [Updated] (HDFS-9901) Move disk IO out of the heartbeat thread

2016-03-14 Thread Hua Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hua Liu updated HDFS-9901:
--
Attachment: 0004-HDFS-9901-move-diskIO-out-of-the-heartbeat-thread.patch

> Move disk IO out of the heartbeat thread
> 
>
> Key: HDFS-9901
> URL: https://issues.apache.org/jira/browse/HDFS-9901
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Hua Liu
>Assignee: Hua Liu
> Attachments: 
> 0001-HDFS-9901-Move-block-validation-out-of-the-heartbeat.patch, 
> 0002-HDFS-9901-Move-block-validation-out-of-the-heartbeat.patch, 
> 0003-HDFS-9901-Move-disk-IO-out-of-the-heartbeat-thread.patch, 
> 0004-HDFS-9901-move-diskIO-out-of-the-heartbeat-thread.patch
>
>
> During heavy disk IO, we noticed the heartbeat thread hanging in the 
> checkBlock method, which checks the existence and length of a block before 
> spinning off a thread to do the actual transfer. In extreme cases, the 
> heartbeat thread hung for more than 10 minutes, so the namenode marked the 
> datanode as dead and started replicating its blocks, which caused more disk IO 
> on other nodes and could potentially bring them down.
> The patch contains two changes:
> 1. Makes DF asynchronous when monitoring the disk by creating a thread that 
> checks the disk and updates the disk status periodically. When the heartbeat 
> thread generates a storage report, it reads disk usage information from 
> memory, so the heartbeat thread won't get blocked during heavy disk IO. 
> 2. Moves the checks (which require disk access) in transferBlock() in 
> DataNode into a separate thread, so the heartbeat thread does not have to wait 
> for them when heartbeating.
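The first change described above, caching the disk-usage figure in a background 
thread so the heartbeat only reads a value from memory, can be sketched roughly 
like this. All names here (CachedDiskUsage, expensiveDiskCheck) are illustrative 
and not taken from the actual patch:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

public class CachedDiskUsage {
    private final AtomicLong cachedAvailable = new AtomicLong(0);
    private final Thread refresher;

    CachedDiskUsage(LongSupplier expensiveDiskCheck, long intervalMs) {
        this.refresher = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                // Disk IO happens here, off the heartbeat thread.
                cachedAvailable.set(expensiveDiskCheck.getAsLong());
                try {
                    Thread.sleep(intervalMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "disk-usage-refresher");
        refresher.setDaemon(true);
    }

    void start() { refresher.start(); }

    /** Called from the heartbeat thread: never blocks on disk. */
    long getAvailable() { return cachedAvailable.get(); }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for a real DF run against a volume.
        CachedDiskUsage usage = new CachedDiskUsage(() -> 42L, 10);
        usage.start();
        Thread.sleep(200); // give the refresher at least one cycle
        System.out.println("available=" + usage.getAvailable());
    }
}
```

The heartbeat path then reads a cached value at storage-report time, at the cost 
of reporting slightly stale usage numbers (bounded by the refresh interval).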





[jira] [Commented] (HDFS-9901) Move disk IO out of the heartbeat thread

2016-03-14 Thread Hua Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194414#comment-15194414
 ] 

Hua Liu commented on HDFS-9901:
---

Added comments for DFRefreshThread and DataCheckAndTransfer.

> Move disk IO out of the heartbeat thread
> 
>
> Key: HDFS-9901
> URL: https://issues.apache.org/jira/browse/HDFS-9901
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Hua Liu
>Assignee: Hua Liu
> Attachments: 
> 0001-HDFS-9901-Move-block-validation-out-of-the-heartbeat.patch, 
> 0002-HDFS-9901-Move-block-validation-out-of-the-heartbeat.patch, 
> 0003-HDFS-9901-Move-disk-IO-out-of-the-heartbeat-thread.patch, 
> 0004-HDFS-9901-move-diskIO-out-of-the-heartbeat-thread.patch
>
>
> During heavy disk IO, we noticed the heartbeat thread hanging in the 
> checkBlock method, which checks the existence and length of a block before 
> spinning off a thread to do the actual transfer. In extreme cases, the 
> heartbeat thread hung for more than 10 minutes, so the namenode marked the 
> datanode as dead and started replicating its blocks, which caused more disk IO 
> on other nodes and could potentially bring them down.
> The patch contains two changes:
> 1. Makes DF asynchronous when monitoring the disk by creating a thread that 
> checks the disk and updates the disk status periodically. When the heartbeat 
> thread generates a storage report, it reads disk usage information from 
> memory, so the heartbeat thread won't get blocked during heavy disk IO. 
> 2. Moves the checks (which require disk access) in transferBlock() in 
> DataNode into a separate thread, so the heartbeat thread does not have to wait 
> for them when heartbeating.





[jira] [Updated] (HDFS-9901) Move disk IO out of the heartbeat thread

2016-03-14 Thread Hua Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hua Liu updated HDFS-9901:
--
Status: Patch Available  (was: Open)

> Move disk IO out of the heartbeat thread
> 
>
> Key: HDFS-9901
> URL: https://issues.apache.org/jira/browse/HDFS-9901
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Hua Liu
>Assignee: Hua Liu
> Attachments: 
> 0001-HDFS-9901-Move-block-validation-out-of-the-heartbeat.patch, 
> 0002-HDFS-9901-Move-block-validation-out-of-the-heartbeat.patch, 
> 0003-HDFS-9901-Move-disk-IO-out-of-the-heartbeat-thread.patch, 
> 0004-HDFS-9901-move-diskIO-out-of-the-heartbeat-thread.patch
>
>
> During heavy disk IO, we noticed the heartbeat thread hanging in the 
> checkBlock method, which checks the existence and length of a block before 
> spinning off a thread to do the actual transfer. In extreme cases, the 
> heartbeat thread hung for more than 10 minutes, so the namenode marked the 
> datanode as dead and started replicating its blocks, which caused more disk IO 
> on other nodes and could potentially bring them down.
> The patch contains two changes:
> 1. Makes DF asynchronous when monitoring the disk by creating a thread that 
> checks the disk and updates the disk status periodically. When the heartbeat 
> thread generates a storage report, it reads disk usage information from 
> memory, so the heartbeat thread won't get blocked during heavy disk IO. 
> 2. Moves the checks (which require disk access) in transferBlock() in 
> DataNode into a separate thread, so the heartbeat thread does not have to wait 
> for them when heartbeating.





[jira] [Updated] (HDFS-9901) Move disk IO out of the heartbeat thread

2016-03-14 Thread Hua Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hua Liu updated HDFS-9901:
--
Status: Open  (was: Patch Available)

> Move disk IO out of the heartbeat thread
> 
>
> Key: HDFS-9901
> URL: https://issues.apache.org/jira/browse/HDFS-9901
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Hua Liu
>Assignee: Hua Liu
> Attachments: 
> 0001-HDFS-9901-Move-block-validation-out-of-the-heartbeat.patch, 
> 0002-HDFS-9901-Move-block-validation-out-of-the-heartbeat.patch, 
> 0003-HDFS-9901-Move-disk-IO-out-of-the-heartbeat-thread.patch
>
>
> During heavy disk IO, we noticed the heartbeat thread hanging in the 
> checkBlock method, which checks the existence and length of a block before 
> spinning off a thread to do the actual transfer. In extreme cases, the 
> heartbeat thread hung for more than 10 minutes, so the namenode marked the 
> datanode as dead and started replicating its blocks, which caused more disk IO 
> on other nodes and could potentially bring them down.
> The patch contains two changes:
> 1. Makes DF asynchronous when monitoring the disk by creating a thread that 
> checks the disk and updates the disk status periodically. When the heartbeat 
> thread generates a storage report, it reads disk usage information from 
> memory, so the heartbeat thread won't get blocked during heavy disk IO. 
> 2. Moves the checks (which require disk access) in transferBlock() in 
> DataNode into a separate thread, so the heartbeat thread does not have to wait 
> for them when heartbeating.





[jira] [Commented] (HDFS-9857) Erasure Coding: Rename replication-based names in BlockManager to more generic [part-1]

2016-03-14 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194409#comment-15194409
 ] 

Zhe Zhang commented on HDFS-9857:
-

Thanks Rakesh. The naming changes in the summary LGTM. Were all changes done 
with IDE refactoring? I'll review the entire patch shortly.

> Erasure Coding: Rename replication-based names in BlockManager to more 
> generic [part-1]
> ---
>
> Key: HDFS-9857
> URL: https://issues.apache.org/jira/browse/HDFS-9857
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-9857-001.patch
>
>
> The idea of this jira is to rename the following entities in BlockManager as,
> - {{UnderReplicatedBlocks}} to {{LowRedundancyBlocks}}
> - {{neededReplications}} to {{neededReconstruction}}
> - {{replicationQueuesInitializer}} to {{reconstructionQueuesInitializer}}
> Thanks [~zhz], [~andrew.wang] for the useful 
> [discussions|https://issues.apache.org/jira/browse/HDFS-7955?focusedCommentId=15149406=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15149406]





[jira] [Updated] (HDFS-9944) Ozone : Add container dispatcher

2016-03-14 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-9944:

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

+1 for patch v002.  The last test failure was unrelated.  I have committed this 
to the HDFS-7240 feature branch.  [~anu], thank you!

> Ozone : Add container dispatcher
> 
>
> Key: HDFS-9944
> URL: https://issues.apache.org/jira/browse/HDFS-9944
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-7240
>
> Attachments: HDFS-9944-HDFS-7240.001.patch, 
> HDFS-9944-HDFS-7240.002.patch
>
>
> This patch takes a request packet from the network layer and delivers it to 
> the container manager, i.e. OzoneContainerManager.





[jira] [Updated] (HDFS-9959) add log when block removed from last live datanode

2016-03-14 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-9959:
--
Component/s: namenode

> add log when block removed from last live datanode
> --
>
> Key: HDFS-9959
> URL: https://issues.apache.org/jira/browse/HDFS-9959
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: yunjiong zhao
>Assignee: yunjiong zhao
>Priority: Minor
> Attachments: HDFS-9959.patch
>
>
> Add logs like "BLOCK* No live nodes contain block blk_1073741825_1001, last 
> datanode contain it is node: 127.0.0.1:65341" in BlockStateChange should help 
> to identify which datanode should be fixed first to recover missing blocks.





[jira] [Commented] (HDFS-9944) Ozone : Add container dispatcher

2016-03-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194354#comment-15194354
 ] 

Hadoop QA commented on HDFS-9944:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
1s {color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} HDFS-7240 passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s 
{color} | {color:green} HDFS-7240 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s 
{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
12s {color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 32s 
{color} | {color:green} HDFS-7240 passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 31s 
{color} | {color:green} HDFS-7240 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
47s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 17s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 1 new + 
0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 28s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 28s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 55m 26s 
{color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.8.0_74. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 53m 21s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
25s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 137m 15s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.TestHFlush |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12793378/HDFS-9944-HDFS-7240.002.patch
 |
| JIRA Issue | HDFS-9944 |
| Optional Tests |  

[jira] [Commented] (HDFS-9959) add log when block removed from last live datanode

2016-03-14 Thread yunjiong zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194351#comment-15194351
 ] 

yunjiong zhao commented on HDFS-9959:
-

If removeNode(Block b, DatanodeDescriptor node) was invoked by 
DatanodeManager.removeDatanode because DatanodeManager detected a datanode with 
a lost heartbeat, the block may still be stored safely on that datanode's disk. 
So after resolving the temporary issue (power, network, ...) and starting the 
datanode process again, we will get the missing block back.



> add log when block removed from last live datanode
> --
>
> Key: HDFS-9959
> URL: https://issues.apache.org/jira/browse/HDFS-9959
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: yunjiong zhao
>Assignee: yunjiong zhao
>Priority: Minor
> Attachments: HDFS-9959.patch
>
>
> Add logs like "BLOCK* No live nodes contain block blk_1073741825_1001, last 
> datanode contain it is node: 127.0.0.1:65341" in BlockStateChange should help 
> to identify which datanode should be fixed first to recover missing blocks.





[jira] [Commented] (HDFS-9959) add log when block removed from last live datanode

2016-03-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194335#comment-15194335
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9959:
---

> For performance penalty, I think it should be fine because it won't generate 
> lots of new message.

This is a good point.

> ... should help to identify which datanode should be fixed first to recover 
> missing blocks.

The block is already deleted from the last datanode.  How would you expect the 
new message to help?

> add log when block removed from last live datanode
> --
>
> Key: HDFS-9959
> URL: https://issues.apache.org/jira/browse/HDFS-9959
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: yunjiong zhao
>Assignee: yunjiong zhao
>Priority: Minor
> Attachments: HDFS-9959.patch
>
>
> Add logs like "BLOCK* No live nodes contain block blk_1073741825_1001, last 
> datanode contain it is node: 127.0.0.1:65341" in BlockStateChange should help 
> to identify which datanode should be fixed first to recover missing blocks.





[jira] [Updated] (HDFS-7987) Allow files / directories to be moved

2016-03-14 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-7987:
---
Attachment: HDFS-7987.03.patch

Here's an updated patch after HDFS-9953.

> Allow files / directories to be moved
> -
>
> Key: HDFS-7987
> URL: https://issues.apache.org/jira/browse/HDFS-7987
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
> Attachments: HDFS-7987.01.patch, HDFS-7987.02.patch, 
> HDFS-7987.03.patch
>
>
> Users should be able to move files / directories using the Namenode UI. 
> WebHDFS supports a rename operation that can be used for this purpose.





[jira] [Commented] (HDFS-9947) Block#toString should not output information from derived classes

2016-03-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194315#comment-15194315
 ] 

Hudson commented on HDFS-9947:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9460 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9460/])
HDFS-9947. Block#toString should not output information from derived (cmccabe: 
rev 9a43094e12ab8d35d49ceda2e2c5f83093bb3a5b)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/Block.java


> Block#toString should not output information from derived classes
> -
>
> Key: HDFS-9947
> URL: https://issues.apache.org/jira/browse/HDFS-9947
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Fix For: 2.9.0
>
> Attachments: HDFS-9947.001.patch
>
>
> {{Block#toString}} should not output information from derived classes.  
> Thanks for [~cnauroth] for spotting this bug.





[jira] [Updated] (HDFS-9947) Block#toString should not output information from derived classes

2016-03-14 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9947:
---
Affects Version/s: (was: 2.8.0)
   2.9.0

> Block#toString should not output information from derived classes
> -
>
> Key: HDFS-9947
> URL: https://issues.apache.org/jira/browse/HDFS-9947
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Fix For: 2.9.0
>
> Attachments: HDFS-9947.001.patch
>
>
> {{Block#toString}} should not output information from derived classes.  
> Thanks for [~cnauroth] for spotting this bug.





[jira] [Updated] (HDFS-9947) Block#toString should not output information from derived classes

2016-03-14 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9947:
---
  Resolution: Fixed
   Fix Version/s: 2.9.0
Target Version/s: 2.9.0  (was: 2.8.0)
  Status: Resolved  (was: Patch Available)

> Block#toString should not output information from derived classes
> -
>
> Key: HDFS-9947
> URL: https://issues.apache.org/jira/browse/HDFS-9947
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Fix For: 2.9.0
>
> Attachments: HDFS-9947.001.patch
>
>
> {{Block#toString}} should not output information from derived classes.  
> Thanks for [~cnauroth] for spotting this bug.





[jira] [Commented] (HDFS-9947) Block#toString should not output information from derived classes

2016-03-14 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194243#comment-15194243
 ] 

Colin Patrick McCabe commented on HDFS-9947:


Thanks, [~cnauroth].  Committed to 2.9.

> Block#toString should not output information from derived classes
> -
>
> Key: HDFS-9947
> URL: https://issues.apache.org/jira/browse/HDFS-9947
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Fix For: 2.9.0
>
> Attachments: HDFS-9947.001.patch
>
>
> {{Block#toString}} should not output information from derived classes.  
> Thanks for [~cnauroth] for spotting this bug.





[jira] [Commented] (HDFS-9959) add log when block removed from last live datanode

2016-03-14 Thread yunjiong zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194245#comment-15194245
 ] 

yunjiong zhao commented on HDFS-9959:
-

For other logs, it is not that convenient. For example, if the block was 
created years ago, we may not find anything about it in the recent 
BlockStateChange logs.

As for the performance penalty, I think it should be fine, because this won't 
generate many new messages.

> add log when block removed from last live datanode
> --
>
> Key: HDFS-9959
> URL: https://issues.apache.org/jira/browse/HDFS-9959
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: yunjiong zhao
>Assignee: yunjiong zhao
>Priority: Minor
> Attachments: HDFS-9959.patch
>
>
> Add logs like "BLOCK* No live nodes contain block blk_1073741825_1001, last 
> datanode contain it is node: 127.0.0.1:65341" in BlockStateChange should help 
> to identify which datanode should be fixed first to recover missing blocks.





[jira] [Commented] (HDFS-9959) add log when block removed from last live datanode

2016-03-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194220#comment-15194220
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9959:
---

For recovering blk_1073741825, we may search other logs for it to find out all 
the locations.  Do we really need this new message?

BTW, the new message is printed within the write lock, so it has a performance 
penalty.
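The fix being suggested in the later comments, collecting messages while holding 
the lock and emitting them only after releasing it, follows a standard pattern. 
A generic sketch (not HDFS code; the class and message strings are illustrative):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class DeferredLogging {
    private static final ReentrantLock writeLock = new ReentrantLock();

    /** Remove blocks; defer log output until the lock is released. */
    static List<String> removeBlocks(List<String> blockIds) {
        List<String> pendingLogs = new ArrayList<>();
        writeLock.lock();
        try {
            for (String id : blockIds) {
                // ... mutate block-map state here ...
                // Don't call the logger while holding the lock;
                // just record what we want to say.
                pendingLogs.add("BLOCK* no live nodes contain " + id);
            }
        } finally {
            writeLock.unlock();
        }
        // Lock released: now do the (possibly slow) IO of logging.
        for (String msg : pendingLogs) {
            System.out.println(msg);
        }
        return pendingLogs;
    }

    public static void main(String[] args) {
        removeBlocks(Arrays.asList("blk_1073741825_1001", "blk_1073741826_1002"));
    }
}
```

The trade-off is that the messages can appear slightly out of order relative to 
other lock holders, but the lock hold time no longer includes logging IO.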

> add log when block removed from last live datanode
> --
>
> Key: HDFS-9959
> URL: https://issues.apache.org/jira/browse/HDFS-9959
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: yunjiong zhao
>Assignee: yunjiong zhao
>Priority: Minor
> Attachments: HDFS-9959.patch
>
>
> Add logs like "BLOCK* No live nodes contain block blk_1073741825_1001, last 
> datanode contain it is node: 127.0.0.1:65341" in BlockStateChange should help 
> to identify which datanode should be fixed first to recover missing blocks.





[jira] [Updated] (HDFS-9959) add log when block removed from last live datanode

2016-03-14 Thread yunjiong zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yunjiong zhao updated HDFS-9959:

Status: Patch Available  (was: Open)

> add log when block removed from last live datanode
> --
>
> Key: HDFS-9959
> URL: https://issues.apache.org/jira/browse/HDFS-9959
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: yunjiong zhao
>Assignee: yunjiong zhao
>Priority: Minor
> Attachments: HDFS-9959.patch
>
>
> Adding logs like "BLOCK* No live nodes contain block blk_1073741825_1001, last 
> datanode contain it is node: 127.0.0.1:65341" in BlockStateChange should help 
> identify which datanode should be fixed first to recover missing blocks.





[jira] [Updated] (HDFS-9959) add log when block removed from last live datanode

2016-03-14 Thread yunjiong zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yunjiong zhao updated HDFS-9959:

Attachment: HDFS-9959.patch

> add log when block removed from last live datanode
> --
>
> Key: HDFS-9959
> URL: https://issues.apache.org/jira/browse/HDFS-9959
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: yunjiong zhao
>Assignee: yunjiong zhao
>Priority: Minor
> Attachments: HDFS-9959.patch
>
>
> Adding logs like "BLOCK* No live nodes contain block blk_1073741825_1001, last 
> datanode contain it is node: 127.0.0.1:65341" in BlockStateChange should help 
> identify which datanode should be fixed first to recover missing blocks.





[jira] [Commented] (HDFS-9953) Download File from UI broken after pagination

2016-03-14 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194193#comment-15194193
 ] 

Chris Nauroth commented on HDFS-9953:
-

[~raviprak], thank you!

> Download File from UI broken after pagination
> -
>
> Key: HDFS-9953
> URL: https://issues.apache.org/jira/browse/HDFS-9953
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Blocker
> Attachments: HDFS-9953.patch
>
>
>  File links do not work from the second page onwards. This was introduced in 
> HDFS-9084.





[jira] [Updated] (HDFS-9953) Download File from UI broken after pagination

2016-03-14 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-9953:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

I've committed this to branch-2.8, branch-2 and trunk. Thanks a lot Brahma, 
Chris and everyone.

> Download File from UI broken after pagination
> -
>
> Key: HDFS-9953
> URL: https://issues.apache.org/jira/browse/HDFS-9953
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Blocker
> Attachments: HDFS-9953.patch
>
>
>  File links do not work from the second page onwards. This was introduced in 
> HDFS-9084.





[jira] [Commented] (HDFS-9918) Erasure Coding: Sort located striped blocks based on decommissioned states

2016-03-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194183#comment-15194183
 ] 

Hadoop QA commented on HDFS-9918:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
41s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 6s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 19s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 1s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 13s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 3s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 86m 35s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_74. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 40s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 193m 50s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.sortStripedBlock(LocatedStripedBlock)
 invokes inefficient new Integer(int) constructor; use Integer.valueOf(int) 
instead  At DatanodeManager.java:constructor; use Integer.valueOf(int) instead  
At DatanodeManager.java:[line 447] |
| JDK v1.8.0_74 Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitCache |
|   | hadoop.hdfs.server.blockmanagement.TestBlockManager |
|   | 

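The FindBugs complaint above about `new Integer(int)` comes down to boxed-integer allocation and caching; a minimal standalone illustration (not the DatanodeManager code itself):

```java
public class BoxingDemo {
    public static void main(String[] args) {
        // new Integer(int) always allocates a fresh object; it is the call
        // FindBugs flags (and it has been deprecated since Java 9).
        Integer allocated = new Integer(447);

        // Integer.valueOf(int) is the suggested replacement: it returns a
        // cached instance for values in [-128, 127] and avoids needless
        // allocation elsewhere.
        Integer viaValueOf = Integer.valueOf(447);

        System.out.println(allocated.equals(viaValueOf));                  // true: same value
        System.out.println(Integer.valueOf(100) == Integer.valueOf(100));  // true: cache hit
        System.out.println(new Integer(100) == new Integer(100));          // false: two objects
    }
}
```

Either call is value-equal to the other; the warning is purely about avoiding the extra allocation.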
[jira] [Commented] (HDFS-9953) Download File from UI broken after pagination

2016-03-14 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194160#comment-15194160
 ] 

Ravi Prakash commented on HDFS-9953:


Thanks for the fix, Brahma, and for pointing me to the issue, Chris. I was 
trying to slip this fix into https://issues.apache.org/jira/browse/HDFS-7987. 
+1, LGTM. Will commit shortly.

> Download File from UI broken after pagination
> -
>
> Key: HDFS-9953
> URL: https://issues.apache.org/jira/browse/HDFS-9953
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Blocker
> Attachments: HDFS-9953.patch
>
>
>  File links do not work from the second page onwards. This was introduced in 
> HDFS-9084.





[jira] [Created] (HDFS-9959) add log when block removed from last live datanode

2016-03-14 Thread yunjiong zhao (JIRA)
yunjiong zhao created HDFS-9959:
---

 Summary: add log when block removed from last live datanode
 Key: HDFS-9959
 URL: https://issues.apache.org/jira/browse/HDFS-9959
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: yunjiong zhao
Assignee: yunjiong zhao
Priority: Minor


Adding logs like "BLOCK* No live nodes contain block blk_1073741825_1001, last 
datanode contain it is node: 127.0.0.1:65341" in BlockStateChange should help 
identify which datanode should be fixed first to recover missing blocks.





[jira] [Commented] (HDFS-9941) Do not log StandbyException on NN, other minor logging fixes

2016-03-14 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194084#comment-15194084
 ] 

Arpit Agarwal commented on HDFS-9941:
-

Thank you [~cnauroth].

> Do not log StandbyException on NN, other minor logging fixes
> 
>
> Key: HDFS-9941
> URL: https://issues.apache.org/jira/browse/HDFS-9941
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: 2.8.0
>
> Attachments: HDFS-9941-branch-2.03.patch, HDFS-9941.01.patch, 
> HDFS-9941.02.patch, HDFS-9941.03.patch
>
>
> The NameNode can skip logging StandbyException messages. These are seen 
> regularly in normal operation and convey no useful information.
> We no longer log the locations of newly allocated blocks in 2.8.0. The DN IDs 
> can be useful for debugging, so let's add them back.





[jira] [Commented] (HDFS-9005) Provide support for upgrade domain script

2016-03-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194071#comment-15194071
 ] 

Hadoop QA commented on HDFS-9005:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 9s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 11 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
35s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 16s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
36s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 25s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 10s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 18s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 18s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 29s 
{color} | {color:red} hadoop-hdfs-project: patch generated 2 new + 436 
unchanged - 9 fixed = 438 total (was 445) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 4s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 9s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 49s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_74. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 34s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_74. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 55s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 52m 59s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 21s 
{color} | {color:red} Patch generated 1 ASF License 

[jira] [Updated] (HDFS-9945) Datanode command for evicting writers

2016-03-14 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-9945:
-
Attachment: HDFS-9945.v2.patch

Fixed the findbugs warning and two checkstyle issues.

> Datanode command for evicting writers
> -
>
> Key: HDFS-9945
> URL: https://issues.apache.org/jira/browse/HDFS-9945
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-9945.patch, HDFS-9945.v2.patch
>
>
> It would be useful to have a command to evict writers from a datanode. 
> When a set of datanodes are being decommissioned, they can get blocked by 
> slow writers at the end.  It was rare in the old days since mapred jobs 
> didn't last too long, but with many different types of apps running on 
> today's YARN clusters, we often see a very long tail in datanode 
> decommissioning.
> I propose adding a new dfsadmin command, {{evictWriters}}. I initially 
> thought about having the namenode automatically tell datanodes on 
> decommissioning, but realized that having a command is more flexible. E.g. 
> users can choose not to do this at all, choose when to evict writers, or 
> whether to try multiple times for whatever reason.





[jira] [Commented] (HDFS-9945) Datanode command for evicting writers

2016-03-14 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194043#comment-15194043
 ] 

Kihwal Lee commented on HDFS-9945:
--

Pre-commit actually ran: 
https://builds.apache.org/job/PreCommit-HDFS-Build/14795/
It didn't post the result, probably because of the JIRA issue at that time.

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote ||  Subsystem ||  Runtime   || Comment ||
|   0  |reexec  |  0m 57s| Docker mode activated. 
|  +1  |   @author  |  0m 0s | The patch does not contain any @author 
|  ||| tags.
|  +1  |test4tests  |  0m 0s | The patch appears to include 1 new or 
|  ||| modified test files.
|   0  |mvndep  |  0m 10s| Maven dependency ordering for branch 
|  +1  |mvninstall  |  6m 51s| trunk passed 
|  +1  |   compile  |  1m 18s| trunk passed with JDK v1.8.0_74 
|  +1  |   compile  |  1m 21s| trunk passed with JDK v1.7.0_95 
|  +1  |checkstyle  |  0m 33s| trunk passed 
|  +1  |   mvnsite  |  1m 27s| trunk passed 
|  +1  |mvneclipse  |  0m 27s| trunk passed 
|  +1  |  findbugs  |  3m 37s| trunk passed 
|  +1  |   javadoc  |  1m 25s| trunk passed with JDK v1.8.0_74 
|  +1  |   javadoc  |  2m 11s| trunk passed with JDK v1.7.0_95 
|   0  |mvndep  |  0m 9s | Maven dependency ordering for patch 
|  +1  |mvninstall  |  1m 18s| the patch passed 
|  +1  |   compile  |  1m 13s| the patch passed with JDK v1.8.0_74 
|  +1  |cc  |  1m 13s| the patch passed 
|  +1  | javac  |  1m 13s| the patch passed 
|  +1  |   compile  |  1m 19s| the patch passed with JDK v1.7.0_95 
|  +1  |cc  |  1m 19s| the patch passed 
|  +1  | javac  |  1m 19s| the patch passed 
|  -1  |checkstyle  |  0m 31s| hadoop-hdfs-project: patch generated 4 
|  ||| new + 538 unchanged - 4 fixed = 542 total
|  ||| (was 542)
|  +1  |   mvnsite  |  1m 22s| the patch passed 
|  +1  |mvneclipse  |  0m 22s| the patch passed 
|  +1  |whitespace  |  0m 0s | Patch has no whitespace issues. 
|  -1  |  findbugs  |  2m 11s| hadoop-hdfs-project/hadoop-hdfs 
|  ||| generated 1 new + 0 unchanged - 0 fixed =
|  ||| 1 total (was 0)
|  +1  |   javadoc  |  1m 23s| the patch passed with JDK v1.8.0_74 
|  +1  |   javadoc  |  2m 8s | the patch passed with JDK v1.7.0_95 
|  +1  |  unit  |  0m 49s| hadoop-hdfs-client in the patch passed 
|  ||| with JDK v1.8.0_74.
|  -1  |  unit  |  55m 54s   | hadoop-hdfs in the patch failed with JDK 
|  ||| v1.8.0_74.
|  +1  |  unit  |  0m 56s| hadoop-hdfs-client in the patch passed 
|  ||| with JDK v1.7.0_95.
|  +1  |  unit  |  53m 36s   | hadoop-hdfs in the patch passed with JDK 
|  ||| v1.7.0_95.
|  +1  |asflicense  |  0m 22s| Patch does not generate ASF License 
|  ||| warnings.
|  ||  149m 8s   | 
\\
\\
||Reason || Tests ||
| FindBugs  |  module:hadoop-hdfs-project/hadoop-hdfs |
|   |  Inconsistent synchronization of 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.xceiver; locked 50% of time 
Unsynchronized access at DataXceiver.java:50% of time Unsynchronized access at 
DataXceiver.java:\[line 222\] |
| JDK v1.8.0_74 Failed junit tests  |  
hadoop.hdfs.server.namenode.TestFSEditLogLoader |
|   |  hadoop.hdfs.TestFileAppend |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12792899/HDFS-9945.patch |
| JIRA Issue | HDFS-9945 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux be2432938a57 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 658ee95 |
| Default Java | 1.7.0_95 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_74 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/14795/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/14795/artifact/patchprocess/new-findbugs-hadoop-hdfs-project_hadoop-hdfs.html
 |
| unit | 

[jira] [Updated] (HDFS-9709) DiskBalancer : Add tests for disk balancer using a Mock Mover class.

2016-03-14 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-9709:
---
Status: Open  (was: Patch Available)

I need to make some changes to the current patch and will resubmit it after that.

> DiskBalancer : Add tests for disk balancer using a Mock Mover class.
> 
>
> Key: HDFS-9709
> URL: https://issues.apache.org/jira/browse/HDFS-9709
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: HDFS-1312
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Attachments: HDFS-9709-HDFS-1312.001.patch
>
>
> Add test cases for DiskBalancer using a Mock Mover class. 





[jira] [Updated] (HDFS-9709) DiskBalancer : Add tests for disk balancer using a Mock Mover class.

2016-03-14 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-9709:

Fix Version/s: (was: HDFS-1312)
   Status: Patch Available  (was: Open)

> DiskBalancer : Add tests for disk balancer using a Mock Mover class.
> 
>
> Key: HDFS-9709
> URL: https://issues.apache.org/jira/browse/HDFS-9709
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: HDFS-1312
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Attachments: HDFS-9709-HDFS-1312.001.patch
>
>
> Add test cases for DiskBalancer using a Mock Mover class. 





[jira] [Updated] (HDFS-9703) DiskBalancer : getBandwidth implementation

2016-03-14 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-9703:

  Resolution: Fixed
Hadoop Flags: Reviewed
Target Version/s:   (was: HDFS-1312)
  Status: Resolved  (was: Patch Available)

Committed to the feature branch. Thanks [~anu].

> DiskBalancer : getBandwidth implementation
> --
>
> Key: HDFS-9703
> URL: https://issues.apache.org/jira/browse/HDFS-9703
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: HDFS-1312
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-1312
>
> Attachments: HDFS-9703-HDFS-1312.001.patch, 
> HDFS-9703-HDFS-1312.002.patch, HDFS-9703-HDFS-1312.003.patch
>
>
> Add getBandwidth call





[jira] [Commented] (HDFS-9703) DiskBalancer : getBandwidth implementation

2016-03-14 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194024#comment-15194024
 ] 

Arpit Agarwal commented on HDFS-9703:
-

+1 for the v3 patch. The checkstyle issues can be ignored; we use this pattern 
commonly in Hadoop. The test failures are unrelated.

> DiskBalancer : getBandwidth implementation
> --
>
> Key: HDFS-9703
> URL: https://issues.apache.org/jira/browse/HDFS-9703
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: HDFS-1312
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-1312
>
> Attachments: HDFS-9703-HDFS-1312.001.patch, 
> HDFS-9703-HDFS-1312.002.patch, HDFS-9703-HDFS-1312.003.patch
>
>
> Add getBandwidth call





[jira] [Updated] (HDFS-9955) DataNode won't self-heal after some block dirs were manually misplaced

2016-03-14 Thread David Watzke (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Watzke updated HDFS-9955:
---
Description: 
I accidentally ran this tool on top of a DataNode's data directories (the 
datanode was shut down at the moment): 
https://github.com/killerwhile/volume-balancer

The tool makes assumptions about block directory placement that are no longer 
valid in hadoop 2.6.0, so it was just moving block directories around between 
different data directories to balance disk usage. Granted, it was not a good 
idea to run it, but my concern is the way the datanode was (not) handling the 
resulting state. I've seen the messages below in the DN log, which means the DN 
knew about the misplaced blocks but didn't do anything to fix them (self-heal 
by copying the other replica), which seems like a bug to me. If you need any 
additional info, please just ask.

{noformat}
2016-03-04 12:40:06,008 WARN 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while finding 
block BP-680964103-A.B.C.D-1375882473930:blk_-3159875140074863904_0 on volume 
/data/18/cdfs/dn
2016-03-04 12:40:06,009 WARN 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while finding 
block BP-680964103-A.B.C.D-1375882473930:blk_8369468090548520777_0 on volume 
/data/18/cdfs/dn
2016-03-04 12:40:06,011 WARN 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while finding 
block BP-680964103-A.B.C.D-1375882473930:blk_1226431637_0 on volume 
/data/18/cdfs/dn
2016-03-04 12:40:06,012 WARN 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while finding 
block BP-680964103-A.B.C.D-1375882473930:blk_1169332185_0 on volume 
/data/18/cdfs/dn
2016-03-04 12:40:06,825 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
opReadBlock BP-680964103-A.B.C.D-1375882473930:blk_1226781281_1099829669050 
received exception java.io.IOException: BlockId 1226781281 is not valid.
2016-03-04 12:40:06,825 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(X.Y.Z.30, 
datanodeUuid=9da950ca-87ae-44ee-9391-0bca669c796b, infoPort=50075, 
ipcPort=50020, 
storageInfo=lv=-56;cid=cluster12;nsid=1625487778;c=1438754073236):Got exception 
while serving BP-680964103-A.B.C.D-1375882473930:blk_1226781281_1099829669050 
to /X.Y.Z.30:48146
java.io.IOException: BlockId 1226781281 is not valid.
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:650)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:641)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getMetaDataInputStream(FsDatasetImpl.java:214)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:282)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:529)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:243)
at java.lang.Thread.run(Thread.java:745)
2016-03-04 12:40:06,826 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 
prg04-002.xyz.tld:50010:DataXceiver error processing READ_BLOCK operation  src: 
/X.Y.Z.30:48146 dst: /X.Y.Z.30:50010
java.io.IOException: BlockId 1226781281 is not valid.
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:650)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:641)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getMetaDataInputStream(FsDatasetImpl.java:214)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:282)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:529)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:243)
at java.lang.Thread.run(Thread.java:745)
{noformat}

  was:
I have accidentally ran this tool on top of DataNode's datadirs (of a datanode 
that was shut down at the moment): 
https://github.com/killerwhile/volume-balancer

The tool makes assumptions about block directory placement that are no longer 
valid in hadoop 2.6.0 and it was just moving them around between different 
datadirs to make the disk usage balanced. OK, it was not a good idea to run it 
but my concern is the way the datanode was (not) handling the resulting state. 
I've seen these messages in DN log (see below) which means DN knew about this 
but didn't do anything to fix it 

[jira] [Commented] (HDFS-9917) IBR accumulate more objects when SNN was down for sometime.

2016-03-14 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193954#comment-15193954
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9917:
---

When SNN is restarted, DNs send a full BR to it.  Then, the IBRs collected 
before the full BR can be dropped.  Is that the case here?
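The idea can be modeled with a toy queue (the names here are invented for illustration, not Hadoop's actual IBR handling): once a full block report describing the DN's complete state is sent, every incremental report queued before it is redundant and can be dropped:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ReportQueue {
    private final Deque<String> pendingIbrs = new ArrayDeque<>();

    void queueIbr(String ibr) {
        pendingIbrs.add(ibr);
    }

    // A full block report (FBR) describes the complete current state of the
    // datanode, so any IBRs queued before it carry no extra information.
    int sendFullReport() {
        int dropped = pendingIbrs.size();
        pendingIbrs.clear();
        return dropped;
    }

    public static void main(String[] args) {
        ReportQueue q = new ReportQueue();
        for (int i = 0; i < 5; i++) {
            q.queueIbr("delete blk_" + i);
        }
        System.out.println("dropped " + q.sendFullReport() + " stale IBRs");
    }
}
```

Dropping the stale queue bounds memory growth while the standby is unreachable, which is exactly the failure mode described in this issue.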

> IBR accumulate more objects when SNN was down for sometime.
> ---
>
> Key: HDFS-9917
> URL: https://issues.apache.org/jira/browse/HDFS-9917
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>
> SNN was down for some time for various reasons. After restarting, SNN became 
> unresponsive because:
> - 29 DNs were each sending ~5 million IBRs (most of them delete IBRs), 
> whereas each datanode had only ~2.5 million blocks.
> - GC couldn't reclaim these objects since they were all sitting in the RPC 
> queue. 
> To recover (i.e. to clear these objects), we restarted all the DNs one by 
> one. This issue happened in 2.4.1, where splitting of the block report was 
> not available.





[jira] [Commented] (HDFS-9944) Ozone : Add container dispatcher

2016-03-14 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193947#comment-15193947
 ] 

Anu Engineer commented on HDFS-9944:


[~cnauroth] Thanks for the code review comments. This new patch fixes all 
issues flagged by you.

> Ozone : Add container dispatcher
> 
>
> Key: HDFS-9944
> URL: https://issues.apache.org/jira/browse/HDFS-9944
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-7240
>
> Attachments: HDFS-9944-HDFS-7240.001.patch, 
> HDFS-9944-HDFS-7240.002.patch
>
>
> This patch takes a request packet from the network layer and delivers it to 
> the container manager, i.e. OzoneContainerManager.





[jira] [Commented] (HDFS-9958) BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed storages.

2016-03-14 Thread Kuhu Shukla (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193948#comment-15193948
 ] 

Kuhu Shukla commented on HDFS-9958:
---

The patch applies to branch-2.7 only.

> BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed 
> storages.
> 
>
> Key: HDFS-9958
> URL: https://issues.apache.org/jira/browse/HDFS-9958
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: HDFS-9958-Test-v1.txt
>
>
> In a scenario where the corrupt replica is on a failed storage, before it is 
> taken out of blocksMap, there is a race which causes the creation of 
> LocatedBlock on a {{machines}} array element that is not populated. 
> Following is the root cause,
> {code}
> final int numCorruptNodes = countNodes(blk).corruptReplicas();
> {code}
> countNodes only looks at nodes with storage state as NORMAL, which in the 
> case where corrupt replica is on failed storage will amount to 
> numCorruptNodes being zero. 
> {code}
> final int numNodes = blocksMap.numNodes(blk);
> {code}
> However, numNodes will count all nodes/storages irrespective of the state of 
> the storage. Therefore numMachines will include such (failed) nodes. The 
> assert fails only when the JVM runs with assertions enabled; otherwise the 
> code goes ahead and creates a LocatedBlock object for an entry that is 
> never populated in the {{machines}} array.
> Here is the stack trace:
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.toDatanodeInfos(DatanodeStorageInfo.java:45)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.toDatanodeInfos(DatanodeStorageInfo.java:40)
>   at 
> org.apache.hadoop.hdfs.protocol.LocatedBlock.(LocatedBlock.java:84)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlock(BlockManager.java:878)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlock(BlockManager.java:826)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlockList(BlockManager.java:799)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlocks(BlockManager.java:899)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1849)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1799)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1712)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:588)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:365)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
> {code}





[jira] [Updated] (HDFS-9944) Ozone : Add container dispatcher

2016-03-14 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-9944:
---
Attachment: HDFS-9944-HDFS-7240.002.patch

> Ozone : Add container dispatcher
> 
>
> Key: HDFS-9944
> URL: https://issues.apache.org/jira/browse/HDFS-9944
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-7240
>
> Attachments: HDFS-9944-HDFS-7240.001.patch, 
> HDFS-9944-HDFS-7240.002.patch
>
>
> This patch takes a request packet from the network layer and delivers it to 
> the container manager, i.e. OzoneContainerManager.





[jira] [Updated] (HDFS-9958) BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed storages.

2016-03-14 Thread Kuhu Shukla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuhu Shukla updated HDFS-9958:
--
Attachment: HDFS-9958-Test-v1.txt

Attaching a sample test that recreates the issue. Please note that the test is 
not yet robust against false positives.

> BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed 
> storages.
> 
>
> Key: HDFS-9958
> URL: https://issues.apache.org/jira/browse/HDFS-9958
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: HDFS-9958-Test-v1.txt
>
>
> In a scenario where the corrupt replica is on a failed storage, before it is 
> taken out of blocksMap, there is a race which causes the creation of 
> LocatedBlock on a {{machines}} array element that is not populated. 
> Following is the root cause,
> {code}
> final int numCorruptNodes = countNodes(blk).corruptReplicas();
> {code}
> countNodes only looks at nodes with storage state as NORMAL, which in the 
> case where corrupt replica is on failed storage will amount to 
> numCorruptNodes being zero. 
> {code}
> final int numNodes = blocksMap.numNodes(blk);
> {code}
> However, numNodes will count all nodes/storages irrespective of the state of 
> the storage. Therefore numMachines will include such (failed) nodes. The 
> assert fails only when the JVM runs with assertions enabled; otherwise the 
> code goes ahead and creates a LocatedBlock object for an entry that is 
> never populated in the {{machines}} array.
> Here is the stack trace:
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.toDatanodeInfos(DatanodeStorageInfo.java:45)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.toDatanodeInfos(DatanodeStorageInfo.java:40)
>   at 
> org.apache.hadoop.hdfs.protocol.LocatedBlock.(LocatedBlock.java:84)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlock(BlockManager.java:878)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlock(BlockManager.java:826)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlockList(BlockManager.java:799)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlocks(BlockManager.java:899)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1849)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1799)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1712)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:588)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:365)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
> {code}





[jira] [Created] (HDFS-9958) BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed storages.

2016-03-14 Thread Kuhu Shukla (JIRA)
Kuhu Shukla created HDFS-9958:
-

 Summary: BlockManager#createLocatedBlocks can throw NPE for 
corruptBlocks on failed storages.
 Key: HDFS-9958
 URL: https://issues.apache.org/jira/browse/HDFS-9958
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.7.2
Reporter: Kuhu Shukla
Assignee: Kuhu Shukla


In a scenario where the corrupt replica is on a failed storage, before it is 
taken out of blocksMap, there is a race which causes the creation of 
LocatedBlock on a {{machines}} array element that is not populated. 

Following is the root cause,
{code}
final int numCorruptNodes = countNodes(blk).corruptReplicas();
{code}
countNodes only looks at nodes with storage state as NORMAL, which in the case 
where corrupt replica is on failed storage will amount to numCorruptNodes being 
zero. 
{code}
final int numNodes = blocksMap.numNodes(blk);
{code}
However, numNodes will count all nodes/storages irrespective of the state of 
the storage. Therefore numMachines will include such (failed) nodes. The assert 
fails only when the JVM runs with assertions enabled; otherwise the code goes 
ahead and creates a LocatedBlock object for an entry that is never populated in 
the {{machines}} array.

Here is the stack trace:
{code}
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.toDatanodeInfos(DatanodeStorageInfo.java:45)
at 
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.toDatanodeInfos(DatanodeStorageInfo.java:40)
at 
org.apache.hadoop.hdfs.protocol.LocatedBlock.(LocatedBlock.java:84)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlock(BlockManager.java:878)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlock(BlockManager.java:826)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlockList(BlockManager.java:799)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlocks(BlockManager.java:899)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1849)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1799)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1712)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:588)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:365)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
{code}
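
The mismatch described above can be reproduced in miniature. The sketch below is a standalone simulation, not BlockManager code: counting corrupt replicas only among NORMAL storages while sizing the array from all storages leaves an unpopulated null slot, and dereferencing it throws the NPE.

```java
import java.util.Arrays;

// Standalone simulation of the race's effect, not actual BlockManager code.
// countNodes() only counts corrupt replicas on NORMAL storages, while
// numNodes() counts every storage (including FAILED ones), so the machines
// array is sized too large and ends with an unpopulated null slot.
public class MachinesArraySketch {
    enum State { NORMAL, FAILED }

    static class Storage {
        final State state;
        final boolean corrupt;
        Storage(State state, boolean corrupt) {
            this.state = state;
            this.corrupt = corrupt;
        }
    }

    public static void main(String[] args) {
        Storage[] storages = {
            new Storage(State.NORMAL, false),
            new Storage(State.NORMAL, false),
            new Storage(State.FAILED, true),  // corrupt replica on failed storage
        };

        // Like countNodes(blk).corruptReplicas(): only NORMAL storages counted.
        long numCorruptNodes = Arrays.stream(storages)
            .filter(s -> s.state == State.NORMAL && s.corrupt).count();  // 0

        // Like blocksMap.numNodes(blk): counts storages regardless of state.
        int numNodes = storages.length;                                  // 3

        int numMachines = numNodes - (int) numCorruptNodes;              // 3
        String[] machines = new String[numMachines];
        int j = 0;
        for (Storage s : storages) {
            if (s.state == State.NORMAL && !s.corrupt) {
                machines[j++] = "dn-" + j;   // only 2 of 3 slots get filled
            }
        }

        try {
            for (String m : machines) {
                System.out.println(m.length());  // machines[2] is null -> NPE
            }
        } catch (NullPointerException e) {
            System.out.println("NPE on unpopulated machines slot, as in the stack trace");
        }
    }
}
```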





[jira] [Commented] (HDFS-9941) Do not log StandbyException on NN, other minor logging fixes

2016-03-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193891#comment-15193891
 ] 

Hudson commented on HDFS-9941:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9459 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9459/])
HDFS-9941. Do not log StandbyException on NN, other minor logging fixes. 
(cnauroth: rev 5644137adad30c84e40d2c4719627b3aabc73628)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockUnderConstructionFeature.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirWriteFileOp.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DecommissionManager.java


> Do not log StandbyException on NN, other minor logging fixes
> 
>
> Key: HDFS-9941
> URL: https://issues.apache.org/jira/browse/HDFS-9941
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: 2.8.0
>
> Attachments: HDFS-9941-branch-2.03.patch, HDFS-9941.01.patch, 
> HDFS-9941.02.patch, HDFS-9941.03.patch
>
>
> The NameNode can skip logging StandbyException messages. These are seen 
> regularly in normal operation and convey no useful information.
> We no longer log the locations of newly allocated blocks in 2.8.0. The DN IDs 
> can be useful for debugging so let's add that back.





[jira] [Updated] (HDFS-9926) ozone : Add volume commands to CLI

2016-03-14 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-9926:

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

I have committed this to the HDFS-7240 feature branch.  [~anu], thank you for 
the patch.

> ozone : Add volume commands to CLI
> --
>
> Key: HDFS-9926
> URL: https://issues.apache.org/jira/browse/HDFS-9926
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-7240
>
> Attachments: HDFS-9926-HDFS-7240.001.patch, 
> HDFS-9926-HDFS-7240.002.patch, HDFS-9926-HDFS-7240.003.patch
>
>
> Adds a cli tool which supports volume commands





[jira] [Updated] (HDFS-9941) Do not log StandbyException on NN, other minor logging fixes

2016-03-14 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-9941:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

+1 for both the trunk patch and the branch-2 patch.  I confirmed that the test 
failures in the last pre-commit run are unrelated and don't repro.  I have 
committed this to trunk, branch-2 and branch-2.8.  [~arpitagarwal], thank you 
for contributing this patch.

> Do not log StandbyException on NN, other minor logging fixes
> 
>
> Key: HDFS-9941
> URL: https://issues.apache.org/jira/browse/HDFS-9941
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: 2.8.0
>
> Attachments: HDFS-9941-branch-2.03.patch, HDFS-9941.01.patch, 
> HDFS-9941.02.patch, HDFS-9941.03.patch
>
>
> The NameNode can skip logging StandbyException messages. These are seen 
> regularly in normal operation and convey no useful information.
> We no longer log the locations of newly allocated blocks in 2.8.0. The DN IDs 
> can be useful for debugging so let's add that back.





[jira] [Commented] (HDFS-9947) Block#toString should not output information from derived classes

2016-03-14 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193706#comment-15193706
 ] 

Chris Nauroth commented on HDFS-9947:
-

Regarding the pre-commit -1 for no tests, I'm willing to let it slide.  This is 
a logging-only change, the code is unlikely to change often, and the method 
contract is well documented.  I'm still +1.

> Block#toString should not output information from derived classes
> -
>
> Key: HDFS-9947
> URL: https://issues.apache.org/jira/browse/HDFS-9947
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-9947.001.patch
>
>
> {{Block#toString}} should not output information from derived classes.  
> Thanks to [~cnauroth] for spotting this bug.





[jira] [Commented] (HDFS-9947) Block#toString should not output information from derived classes

2016-03-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193686#comment-15193686
 ] 

Hadoop QA commented on HDFS-9947:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
4s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
42s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
32s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 28s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
52s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 59s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_74. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 56s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 21m 9s {color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12793330/HDFS-9947.001.patch |
| JIRA Issue | HDFS-9947 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 792dafb41f08 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Updated] (HDFS-9947) Block#toString should not output information from derived classes

2016-03-14 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-9947:

Hadoop Flags: Reviewed

+1 for the patch pending pre-commit run.  [~cmccabe], thank you.

> Block#toString should not output information from derived classes
> -
>
> Key: HDFS-9947
> URL: https://issues.apache.org/jira/browse/HDFS-9947
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-9947.001.patch
>
>
> {{Block#toString}} should not output information from derived classes.  
> Thanks to [~cnauroth] for spotting this bug.





[jira] [Updated] (HDFS-9953) Download File from UI broken after pagination

2016-03-14 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-9953:

Target Version/s: 2.8.0
Priority: Blocker  (was: Major)
 Component/s: namenode

I'm flagging this as a blocker for 2.8.0, because it's a regression.  
[~raviprak], [~wheat9] and [~ozawa], would you mind code reviewing the patch, 
since you have the prior context from participating in HDFS-9084?  Thank you.

> Download File from UI broken after pagination
> -
>
> Key: HDFS-9953
> URL: https://issues.apache.org/jira/browse/HDFS-9953
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Blocker
> Attachments: HDFS-9953.patch
>
>
>  File links are not working from the second page onwards. This was introduced 
> in HDFS-9084.





[jira] [Commented] (HDFS-9956) LDAP PERFORMANCE ISSUE AND FAIL OVER

2016-03-14 Thread sanjay kenganahalli vamanna (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193640#comment-15193640
 ] 

sanjay kenganahalli vamanna commented on HDFS-9956:
---


The default for ha.zookeeper.session-timeout.ms is 5 seconds, but it has to be 
greater than hadoop.security.group.mapping.ldap.directory.search.timeout 
(default 10 seconds). We increased "ha.zookeeper.session-timeout.ms" to 20 
seconds but still have the issue.
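
The timeout relationship the comment describes can be sanity-checked as below. This is an illustrative sketch, not a Hadoop API; the helper name `ldapLookupCanOutliveZkSession` is hypothetical:

```java
// Illustrative sanity check of the timeout relationship discussed above.
// The point: if the ZK session timeout is smaller than the LDAP directory
// search timeout, a slow group lookup can outlive the ZK session and so
// trigger a NameNode failover.
public class TimeoutCheckSketch {
    static boolean ldapLookupCanOutliveZkSession(long zkSessionTimeoutMs,
                                                 long ldapSearchTimeoutMs) {
        return ldapSearchTimeoutMs >= zkSessionTimeoutMs;
    }

    public static void main(String[] args) {
        // The defaults quoted in the comment above: 5s ZK session vs 10s LDAP search.
        System.out.println("5s vs 10s risky: "
            + ldapLookupCanOutliveZkSession(5_000, 10_000));
        // After raising the ZK session timeout to 20s the single-search case is
        // covered, but a lookup spanning 100+ groups may chain several searches
        // and still exceed it, so the risk is reduced, not eliminated.
        System.out.println("20s vs 10s risky: "
            + ldapLookupCanOutliveZkSession(20_000, 10_000));
    }
}
```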


> LDAP PERFORMANCE ISSUE AND FAIL OVER
> 
>
> Key: HDFS-9956
> URL: https://issues.apache.org/jira/browse/HDFS-9956
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: sanjay kenganahalli vamanna
>
> The typical LDAP group name resolution works well under typical scenarios. 
> However, we have seen cases where a user is mapped to many groups (in an 
> extreme case, a user is mapped to more than 100 groups). The way it is 
> currently implemented makes resolving groups from Active Directory extremely 
> slow in this case, causing the namenode to fail over.
> Instead of failing over, we can use the 
> ha.zookeeper.session-timeout.ms parameter in the getgroups method to 
> time out and send a failure response back to the user, so that we can 
> prevent namenode failover.





[jira] [Assigned] (HDFS-9957) HDFS's use of mlock() is not portable

2016-03-14 Thread Alan Burlison (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Burlison reassigned HDFS-9957:
---

Assignee: Alan Burlison

> HDFS's use of mlock() is not portable
> -
>
> Key: HDFS-9957
> URL: https://issues.apache.org/jira/browse/HDFS-9957
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: native
>Affects Versions: 2.7.2
> Environment: Any UNIX system other than Linux
>Reporter: Alan Burlison
>Assignee: Alan Burlison
>
> HDFS uses mlock() to lock in the memory used to back java.nio.Buffer. 
> Unfortunately the way it is done is not standards-compliant. As the Linux 
> manpage for mlock() says:
> {quote}
>Under Linux, mlock(), mlock2(), and munlock() automatically round
>addr down to the nearest page boundary.  However, the POSIX.1
>specification of mlock() and munlock() allows an implementation to
>require that addr is page aligned, so portable applications should
>ensure this.
> {quote}
> The HDFS code does not do any such alignment, nor is it true that the backing 
> buffers for java.nio.Buffer are necessarily page aligned. And even if the 
> address was aligned by the code, it would end up calling mlock() on other 
> random JVM data structures that shared the same page. That seems potentially 
> dangerous.





[jira] [Created] (HDFS-9957) HDFS's use of mlock() is not portable

2016-03-14 Thread Alan Burlison (JIRA)
Alan Burlison created HDFS-9957:
---

 Summary: HDFS's use of mlock() is not portable
 Key: HDFS-9957
 URL: https://issues.apache.org/jira/browse/HDFS-9957
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: native
Affects Versions: 2.7.2
 Environment: Any UNIX system other than Linux
Reporter: Alan Burlison


HDFS uses mlock() to lock in the memory used to back java.nio.Buffer. 
Unfortunately the way it is done is not standards-compliant. As the Linux 
manpage for mlock() says:

{quote}
   Under Linux, mlock(), mlock2(), and munlock() automatically round
   addr down to the nearest page boundary.  However, the POSIX.1
   specification of mlock() and munlock() allows an implementation to
   require that addr is page aligned, so portable applications should
   ensure this.
{quote}

The HDFS code does not do any such alignment, nor is it true that the backing 
buffers for java.nio.Buffer are necessarily page aligned. And even if the 
address was aligned by the code, it would end up calling mlock() on other 
random JVM data structures that shared the same page. That seems potentially 
dangerous.
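
The page alignment that POSIX allows an implementation to require can be computed before handing the range to a native mlock() wrapper. The arithmetic below is a sketch of that rounding only; as the report notes, the widened range may still cover unrelated JVM data sharing the same pages:

```java
// Sketch of the page alignment a portable mlock() caller would need before
// passing an address to the native call: round the start down to a page
// boundary and the end up, then lock the widened range. The caveat from the
// report still holds: the widened range may cover unrelated JVM data that
// happens to share the same pages.
public class PageAlignSketch {
    static long alignDown(long addr, long pageSize) {
        return addr & ~(pageSize - 1);   // pageSize must be a power of two
    }

    static long alignedLength(long addr, long len, long pageSize) {
        long start = alignDown(addr, pageSize);
        long end = addr + len;
        long alignedEnd = (end + pageSize - 1) & ~(pageSize - 1);  // round up
        return alignedEnd - start;
    }

    public static void main(String[] args) {
        long pageSize = 4096;
        long addr = 0x12345;             // not page aligned
        long len = 100;
        long start = alignDown(addr, pageSize);
        long total = alignedLength(addr, len, pageSize);
        // [0x12345, 0x123a9) fits inside the single page starting at 0x12000.
        System.out.printf("mlock(0x%x, %d) instead of mlock(0x%x, %d)%n",
            start, total, addr, len);
    }
}
```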





[jira] [Commented] (HDFS-9956) LDAP PERFORMANCE ISSUE AND FAIL OVER

2016-03-14 Thread sanjay kenganahalli vamanna (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193620#comment-15193620
 ] 

sanjay kenganahalli vamanna commented on HDFS-9956:
---

The default of 10 seconds is not working, and we have been facing the same 
problem for many days. We don't want to keep the users in static binding, and 
we don't want to use the Unix shell group mapping either.

> LDAP PERFORMANCE ISSUE AND FAIL OVER
> 
>
> Key: HDFS-9956
> URL: https://issues.apache.org/jira/browse/HDFS-9956
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: sanjay kenganahalli vamanna
>
> The typical LDAP group name resolution works well under typical scenarios. 
> However, we have seen cases where a user is mapped to many groups (in an 
> extreme case, a user is mapped to more than 100 groups). The way it is 
> currently implemented makes resolving groups from Active Directory extremely 
> slow in this case, causing the namenode to fail over.
> Instead of failing over, we can use the 
> ha.zookeeper.session-timeout.ms parameter in the getgroups method to 
> time out and send a failure response back to the user, so that we can 
> prevent namenode failover.





[jira] [Commented] (HDFS-9918) Erasure Coding: Sort located striped blocks based on decommissioned states

2016-03-14 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193601#comment-15193601
 ] 

Rakesh R commented on HDFS-9918:


Thank you [~zhz] for the useful comments. I've modified the sorting logic to 
use a {{blkIndex2LocationsMap}} map, and also addressed the other comments. 
Please review the latest patch again.

> Erasure Coding: Sort located striped blocks based on decommissioned states
> --
>
> Key: HDFS-9918
> URL: https://issues.apache.org/jira/browse/HDFS-9918
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-9918-001.patch, HDFS-9918-002.patch
>
>
> This jira is a follow-on work of HDFS-8786, where we do decommissioning of 
> datanodes having striped blocks.
> Now, after decommissioning, the ordering of the storage list needs to change 
> so that decommissioned datanodes are placed last in the list.
> For example, assume we have a block group with storage list:-
> d0, d1, d2, d3, d4, d5, d6, d7, d8, d9
> mapping to indices
> 0, 1, 2, 3, 4, 5, 6, 7, 8, 2
> Here the internal block b2 is duplicated, located on both d2 and d9. If d2 is 
> a decommissioning node, we should swap d2 and d9 in the storage list.
> Thanks [~jingzhao] for the 
> [discussions|https://issues.apache.org/jira/browse/HDFS-8786?focusedCommentId=15180415=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15180415]
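
The reordering described in the issue can be sketched as a stable partition. This is illustrative only, not the patch's code; the `Dn` type and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the reordering described above (not the patch code):
// stably move decommissioning/decommissioned storages to the end of a
// located-block storage list so that readers prefer live replicas first.
public class SortStripedSketch {
    static class Dn {
        final String name;
        final boolean decommissioning;
        Dn(String name, boolean decommissioning) {
            this.name = name;
            this.decommissioning = decommissioning;
        }
    }

    // Stable partition: live nodes keep their relative order,
    // decommissioning nodes go last (also in their original order).
    static List<Dn> sortDecommissionedLast(List<Dn> storages) {
        List<Dn> live = new ArrayList<>();
        List<Dn> decom = new ArrayList<>();
        for (Dn d : storages) {
            (d.decommissioning ? decom : live).add(d);
        }
        live.addAll(decom);
        return live;
    }

    public static void main(String[] args) {
        // d2 is decommissioning; its internal block b2 also lives on d9.
        List<Dn> storages = List.of(
            new Dn("d0", false), new Dn("d1", false),
            new Dn("d2", true), new Dn("d9", false));
        List<Dn> sorted = sortDecommissionedLast(storages);
        System.out.println("last storage: " + sorted.get(sorted.size() - 1).name);
    }
}
```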





[jira] [Comment Edited] (HDFS-9668) Optimize the locking in FsDatasetImpl

2016-03-14 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193595#comment-15193595
 ] 

Colin Patrick McCabe edited comment on HDFS-9668 at 3/14/16 4:37 PM:
-

Thanks for revising this, [~jingcheng...@intel.com].  I think that it looks 
much better now that it is no longer a separate dataset implementation.  I 
revoke my -1.

A 10 gigabyte HDFS file that uses 5 MB HDFS blocks seems like an extremely 
unusual case.  That would result in just that single file having 2,097,152 
blocks.  I guess perhaps this is intended to simulate a case where we have many 
small files leading to small blocks?

One thing that I can see about this code is that there are many cases where we 
could drop the lock earlier than we do.  For example, in this function:

{code}
  @Override // FsDatasetSpi
  public synchronized Block getStoredBlock(String bpid, long blkid)
  throws IOException {
File blockfile = getFile(bpid, blkid, false);
if (blockfile == null) {
  return null;
}
final File metafile = FsDatasetUtil.findMetaFile(blockfile);
final long gs = FsDatasetUtil.parseGenerationStamp(blockfile, metafile);
return new Block(blkid, blockfile.length(), gs);
  }
{code}

The only thing that needs to be protected by the lock is the call to 
{{FsDatasetImpl#getFile}}, since it reads from the {{volumeMap}}.  
{{FsDatasetUtil#findMetaFile}} doesn't need protection since it just lists the 
block files in the directory, and {{parseGenerationStamp}} just applies a 
regular expression to the metadata file name.

There are a lot of other cases like this.  I think reducing the unnecessary 
locking would be better than making the locking more complex.  After all, even 
with lock striping, we may find that several "hot" blocks share the same lock 
stripe, and therefore that we gain no more concurrency.  I wonder what numbers 
you get if you just change these functions to drop the lock except when they 
really need it to access the {{volumeMap}}?
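The narrowed-locking suggestion can be sketched as follows. This is a hypothetical, self-contained analogue rather than {{FsDatasetImpl}} itself: {{volumeMap}} is modeled as a plain {{HashMap}}, only its lookup is guarded, and the slower file-system call runs outside the monitor.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

public class NarrowLockDemo {
    // Stand-in for FsDatasetImpl's volumeMap (hypothetical).
    private final Map<Long, File> volumeMap = new HashMap<>();

    public void register(long blkid, File f) {
        synchronized (this) {
            volumeMap.put(blkid, f);
        }
    }

    public long storedBlockLength(long blkid) {
        File blockfile;
        synchronized (this) {   // hold the monitor only for the map read
            blockfile = volumeMap.get(blkid);
        }
        if (blockfile == null) {
            return -1;
        }
        // File-system access happens outside the lock, so a slow disk
        // cannot block other threads that only need the map.
        return blockfile.length();
    }
}
```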

I notice that this patch adds a reader/writer lock.  While this allows many 
concurrent readers, it seems like it could allow starvation of writer threads.  
If we are going to use an R/W lock, I think we should choose a "fair" R/W lock 
to avoid this issue.
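The fairness point can be shown with a minimal sketch: passing {{true}} to {{ReentrantReadWriteLock}} selects the fair policy, so queued writers are not starved by a continuous stream of readers. The class below is illustrative only.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class FairLockDemo {
    // 'true' selects the fair acquisition policy: waiting threads acquire
    // the lock in approximate FIFO order, so a writer cannot be passed
    // over indefinitely by newly arriving readers.
    static final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);

    static int sharedValue = 0;

    static int read() {
        lock.readLock().lock();
        try {
            return sharedValue;
        } finally {
            lock.readLock().unlock();
        }
    }

    static void write(int value) {
        lock.writeLock().lock();
        try {
            sharedValue = value;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

The trade-off is that fair locks generally have lower throughput than the default non-fair policy, which is worth measuring before adopting them.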


was (Author: cmccabe):
Thanks for revising this, [~jingcheng...@intel.com].  I think that it looks 
much better now that it is no longer a separate dataset implementation.  I 
revoke my -1.

A 10 gigabyte HDFS file that uses 5 MB HDFS blocks seems like an extremely 
unusual case.  That would result in just that single file having 2,097,152 
blocks.  I guess perhaps this is intended to simulate a case where we have many 
small files leading to small blocks?

One thing that I can see about this code is that there are many cases where we 
could drop the lock earlier than we do.  For example, in this function:

{code}
  @Override // FsDatasetSpi
  public synchronized Block getStoredBlock(String bpid, long blkid)
  throws IOException {
File blockfile = getFile(bpid, blkid, false);
if (blockfile == null) {
  return null;
}
final File metafile = FsDatasetUtil.findMetaFile(blockfile);
final long gs = FsDatasetUtil.parseGenerationStamp(blockfile, metafile);
return new Block(blkid, blockfile.length(), gs);
  }
{code}

The only thing that needs to be protected by the lock is the call to 
{{FsDatasetImpl#getFile}}, since it reads from the {{volumeMap}}.  
{{FsDatasetUtil#findMetaFile}} doesn't need protection since it just lists the 
block files in the directory, and {{parseGenerationStamp}} just applies a 
regular expression to the metadata file name.

There are a lot of other cases like this.  I think reducing the unnecessary 
locking would be better than making the locking more complex.  After all, even 
with lock striping, we may find that several "hot" blocks share the same lock 
stripe, and therefore that we gain no more concurrency.  I wonder what numbers 
you get if you just change these functions to drop the lock except when they 
really need it to access the {{volumeMap}}?

I notice that this patch adds a reader/writer lock.  While this allows many 
concurrent readers, it seems like it could allow starvation of writer threads.  
If we are going to use an R/W lock, I think we should choose a fair R/W lock to 
avoid this issue.

> Optimize the locking in FsDatasetImpl
> -
>
> Key: HDFS-9668
> URL: https://issues.apache.org/jira/browse/HDFS-9668
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Jingcheng Du
>Assignee: Jingcheng Du
> Attachments: HDFS-9668-1.patch, HDFS-9668-2.patch, execution_time.png
>
>
> During the HBase test on a tiered storage of HDFS (WAL is stored in 
> SSD/RAMDISK, and 

[jira] [Updated] (HDFS-9668) Optimize the locking in FsDatasetImpl

2016-03-14 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9668:
---
Summary: Optimize the locking in FsDatasetImpl  (was: Many long-time 
BLOCKED threads on FsDatasetImpl in a tiered storage test)

> Optimize the locking in FsDatasetImpl
> -
>
> Key: HDFS-9668
> URL: https://issues.apache.org/jira/browse/HDFS-9668
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Jingcheng Du
>Assignee: Jingcheng Du
> Attachments: HDFS-9668-1.patch, HDFS-9668-2.patch, execution_time.png
>
>
> During the HBase test on a tiered storage of HDFS (WAL is stored in 
> SSD/RAMDISK, and all other files are stored in HDD), we observe many 
> long-time BLOCKED threads on FsDatasetImpl in DataNode. The following is part 
> of the jstack result:
> {noformat}
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48521 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779272_40852]" - Thread 
> t@93336
>java.lang.Thread.State: BLOCKED
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:)
>   - waiting to lock <18324c9> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) owned by 
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48520 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" t@93335
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
>   at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
>   - None
>   
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48520 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" - Thread 
> t@93335
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.createFileExclusively(Native Method)
>   at java.io.File.createNewFile(File.java:1012)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:66)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:271)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:286)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:1140)
>   - locked <18324c9> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
>   at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
>   - None
> {noformat}
> We measured the execution of some operations in FsDatasetImpl during the 
> test. Here following is the result.
> !execution_time.png!
> The operations of finalizeBlock, addBlock and createRbw on HDD in a heavy 
> load take a really long time.
> It means one slow operation of finalizeBlock, addBlock and createRbw in a 
> slow storage can block all the other same operations in the same DataNode, 
> especially in HBase when many wal/flusher/compactor are configured.
> We need a finer grained lock mechanism in a new FsDatasetImpl implementation 
> and users can choose the implementation by configuring 
> "dfs.datanode.fsdataset.factory" in DataNode.
> We can implement the lock by either storage level or block-level.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9668) Many long-time BLOCKED threads on FsDatasetImpl in a tiered storage test

2016-03-14 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193595#comment-15193595
 ] 

Colin Patrick McCabe commented on HDFS-9668:


Thanks for revising this, [~jingcheng...@intel.com].  I think that it looks 
much better now that it is no longer a separate dataset implementation.  I 
revoke my -1.

A 10 gigabyte HDFS file that uses 5 MB HDFS blocks seems like an extremely 
unusual case.  That would result in just that single file having 2,097,152 
blocks.  I guess perhaps this is intended to simulate a case where we have many 
small files leading to small blocks?

One thing that I can see about this code is that there are many cases where we 
could drop the lock earlier than we do.  For example, in this function:

{code}
  @Override // FsDatasetSpi
  public synchronized Block getStoredBlock(String bpid, long blkid)
  throws IOException {
File blockfile = getFile(bpid, blkid, false);
if (blockfile == null) {
  return null;
}
final File metafile = FsDatasetUtil.findMetaFile(blockfile);
final long gs = FsDatasetUtil.parseGenerationStamp(blockfile, metafile);
return new Block(blkid, blockfile.length(), gs);
  }
{code}

The only thing that needs to be protected by the lock is the call to 
{{FsDatasetImpl#getFile}}, since it reads from the {{volumeMap}}.  
{{FsDatasetUtil#findMetaFile}} doesn't need protection since it just lists the 
block files in the directory, and {{parseGenerationStamp}} just applies a 
regular expression to the metadata file name.

There are a lot of other cases like this.  I think reducing the unnecessary 
locking would be better than making the locking more complex.  After all, even 
with lock striping, we may find that several "hot" blocks share the same lock 
stripe, and therefore that we gain no more concurrency.  I wonder what numbers 
you get if you just change these functions to drop the lock except when they 
really need it to access the {{volumeMap}}?

I notice that this patch adds a reader/writer lock.  While this allows many 
concurrent readers, it seems like it could allow starvation of writer threads.  
If we are going to use an R/W lock, I think we should choose a fair R/W lock to 
avoid this issue.

> Many long-time BLOCKED threads on FsDatasetImpl in a tiered storage test
> 
>
> Key: HDFS-9668
> URL: https://issues.apache.org/jira/browse/HDFS-9668
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Jingcheng Du
>Assignee: Jingcheng Du
> Attachments: HDFS-9668-1.patch, HDFS-9668-2.patch, execution_time.png
>
>
> During the HBase test on a tiered storage of HDFS (WAL is stored in 
> SSD/RAMDISK, and all other files are stored in HDD), we observe many 
> long-time BLOCKED threads on FsDatasetImpl in DataNode. The following is part 
> of the jstack result:
> {noformat}
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48521 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779272_40852]" - Thread 
> t@93336
>java.lang.Thread.State: BLOCKED
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:)
>   - waiting to lock <18324c9> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) owned by 
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48520 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" t@93335
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
>   at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
>   - None
>   
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48520 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" - Thread 
> t@93335
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.createFileExclusively(Native Method)
>   at java.io.File.createNewFile(File.java:1012)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:66)
>   at 
> 

[jira] [Updated] (HDFS-9918) Erasure Coding: Sort located striped blocks based on decommissioned states

2016-03-14 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-9918:
---
Attachment: HDFS-9918-002.patch

> Erasure Coding: Sort located striped blocks based on decommissioned states
> --
>
> Key: HDFS-9918
> URL: https://issues.apache.org/jira/browse/HDFS-9918
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-9918-001.patch, HDFS-9918-002.patch
>
>
> This jira is a follow-on work of HDFS-8786, where we do decommissioning of 
> datanodes having striped blocks.
> Now, after decommissioning, the ordering of the storage list needs to change 
> so that decommissioned datanodes appear only at the end of the list.
> For example, assume we have a block group with storage list:-
> d0, d1, d2, d3, d4, d5, d6, d7, d8, d9
> mapping to indices
> 0, 1, 2, 3, 4, 5, 6, 7, 8, 2
> Here the internal block b2 is duplicated, located in both d2 and d9. If d2 is 
> a decommissioning node, we should swap d2 and d9 in the storage list.
> Thanks [~jingzhao] for the 
> [discussions|https://issues.apache.org/jira/browse/HDFS-8786?focusedCommentId=15180415=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15180415]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9956) LDAP PERFORMANCE ISSUE AND FAIL OVER

2016-03-14 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193587#comment-15193587
 ] 

Wei-Chiu Chuang commented on HDFS-9956:
---

That is implemented in HADOOP-9322; it has been available since Hadoop 
2.1.0-beta, or since CDH 4.3.0 if you're using CDH.

> LDAP PERFORMANCE ISSUE AND FAIL OVER
> 
>
> Key: HDFS-9956
> URL: https://issues.apache.org/jira/browse/HDFS-9956
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: sanjay kenganahalli vamanna
>
> LDAP group name resolution works well under typical scenarios. However, we 
> have seen cases where a user is mapped to many groups (in an extreme case, 
> more than 100 groups). The current implementation makes resolving groups 
> from Active Directory extremely slow in this case, causing the namenode to 
> fail over.
> Instead of failing over, we can use the ha.zookeeper.session-timeout.ms 
> parameter in the getgroups method to time out and send a failure response 
> back to the user, preventing namenode failover.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9956) LDAP PERFORMANCE ISSUE AND FAIL OVER

2016-03-14 Thread sanjay kenganahalli vamanna (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193579#comment-15193579
 ] 

sanjay kenganahalli vamanna commented on HDFS-9956:
---

Thanks for replying. Which version of Hadoop has this parameter?

> LDAP PERFORMANCE ISSUE AND FAIL OVER
> 
>
> Key: HDFS-9956
> URL: https://issues.apache.org/jira/browse/HDFS-9956
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: sanjay kenganahalli vamanna
>
> LDAP group name resolution works well under typical scenarios. However, we 
> have seen cases where a user is mapped to many groups (in an extreme case, 
> more than 100 groups). The current implementation makes resolving groups 
> from Active Directory extremely slow in this case, causing the namenode to 
> fail over.
> Instead of failing over, we can use the ha.zookeeper.session-timeout.ms 
> parameter in the getgroups method to time out and send a failure response 
> back to the user, preventing namenode failover.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9005) Provide support for upgrade domain script

2016-03-14 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-9005:
--
Attachment: HDFS-9005-4.patch

New patch to fix more checkstyle issues. The asflicense issue is due to the 
test JSON file, since JSON does not allow comments (and hence no license 
header). If a header is absolutely required, we can modify the test case to 
generate the JSON file at run time.

> Provide support for upgrade domain script
> -
>
> Key: HDFS-9005
> URL: https://issues.apache.org/jira/browse/HDFS-9005
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-9005-2.patch, HDFS-9005-3.patch, HDFS-9005-4.patch, 
> HDFS-9005.patch
>
>
> As part of the upgrade domain feature, we need to provide a mechanism to 
> specify upgrade domain for each datanode. One way to accomplish that is to 
> allow admins specify an upgrade domain script that takes DN ip or hostname as 
> input and return the upgrade domain. Then namenode will use it at run time to 
> set {{DatanodeInfo}}'s upgrade domain string. The configuration can be 
> something like:
> {noformat}
> 
> <property>
>   <name>dfs.namenode.upgrade.domain.script.file.name</name>
>   <value>/etc/hadoop/conf/upgrade-domain.sh</value>
> </property>
> 
> {noformat}
> just like topology script, 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9956) LDAP PERFORMANCE ISSUE AND FAIL OVER

2016-03-14 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193536#comment-15193536
 ] 

Wei-Chiu Chuang commented on HDFS-9956:
---

Hi [~sanjayvamanna] thanks for reporting the issue and offering workarounds.

The parameter {{hadoop.security.group.mapping.ldap.directory.search.timeout}} 
is supposed to stop queries if it goes over time. Would this parameter work in 
your scenario? 
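For reference, that timeout is configured in core-site.xml. A sketch with a value of 10000 (milliseconds) is shown below; verify the default against the core-default.xml shipped with your Hadoop version.

```xml
<property>
  <name>hadoop.security.group.mapping.ldap.directory.search.timeout</name>
  <value>10000</value>
</property>
```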

> LDAP PERFORMANCE ISSUE AND FAIL OVER
> 
>
> Key: HDFS-9956
> URL: https://issues.apache.org/jira/browse/HDFS-9956
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: sanjay kenganahalli vamanna
>
> LDAP group name resolution works well under typical scenarios. However, we 
> have seen cases where a user is mapped to many groups (in an extreme case, 
> more than 100 groups). The current implementation makes resolving groups 
> from Active Directory extremely slow in this case, causing the namenode to 
> fail over.
> Instead of failing over, we can use the ha.zookeeper.session-timeout.ms 
> parameter in the getgroups method to time out and send a failure response 
> back to the user, preventing namenode failover.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9951) Use string constants for XML tags in OfflineImageReconstructor

2016-03-14 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193531#comment-15193531
 ] 

Colin Patrick McCabe commented on HDFS-9951:


It seems reasonable to me.  It seems like {{PBImageXmlWriter.java}} should use 
the same set of string constants.  This will also ensure that the strings used 
during serialization are the same as the ones used during deserialization.

> Use string constants for XML tags in OfflineImageReconstructor
> --
>
> Key: HDFS-9951
> URL: https://issues.apache.org/jira/browse/HDFS-9951
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
>Priority: Minor
> Attachments: HDFS-9551.001.patch
>
>
> The class {{OfflineImageReconstructor}} uses many {{SectionProcessors}} to 
> process XML files and load XML subtrees into a Node structure. But in many 
> places a node removes a key by passing a string literal directly to the 
> method rather than using a predefined constant. Like this:
> {code}
> Node expiration = directive.removeChild("expiration");
> {code}
> We could improve this by defining the constants in Node and invoking them 
> like this:
> {code}
> Node expiration = directive.removeChild(Node.CACHE_MANAGER_SECTION_EXPIRATION);
> {code}
> This will also make node key names easier to manage in the future.
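The proposal can be illustrated with a small, hypothetical sketch. The class below is not the real Node; it only shows how a shared tag-name constant keeps the writer and the reconstructor from drifting apart.

```java
import java.util.HashMap;
import java.util.Map;

public class NodeSketch {
    // Shared tag-name constant: using the same constant from the XML
    // writer and the reconstructor guarantees that serialization and
    // deserialization agree on the tag string.
    public static final String CACHE_MANAGER_SECTION_EXPIRATION = "expiration";

    private final Map<String, NodeSketch> children = new HashMap<>();

    public void addChild(String name, NodeSketch child) {
        children.put(name, child);
    }

    // Removes and returns the named child, or null if absent.
    public NodeSketch removeChild(String name) {
        return children.remove(name);
    }
}
```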



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9951) Use string constants for XML tags in OfflineImageReconstructor

2016-03-14 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9951:
---
Summary: Use string constants for XML tags in OfflineImageReconstructor  
(was: Improve the Node key definition in OfflineImageReconstructor)

> Use string constants for XML tags in OfflineImageReconstructor
> --
>
> Key: HDFS-9951
> URL: https://issues.apache.org/jira/browse/HDFS-9951
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
>Priority: Minor
> Attachments: HDFS-9551.001.patch
>
>
> The class {{OfflineImageReconstructor}} uses many {{SectionProcessors}} to 
> process XML files and load XML subtrees into a Node structure. But in many 
> places a node removes a key by passing a string literal directly to the 
> method rather than using a predefined constant. Like this:
> {code}
> Node expiration = directive.removeChild("expiration");
> {code}
> We could improve this by defining the constants in Node and invoking them 
> like this:
> {code}
> Node expiration = directive.removeChild(Node.CACHE_MANAGER_SECTION_EXPIRATION);
> {code}
> This will also make node key names easier to manage in the future.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9956) LDAP PERFORMANCE ISSUE AND FAIL OVER

2016-03-14 Thread sanjay kenganahalli vamanna (JIRA)
sanjay kenganahalli vamanna created HDFS-9956:
-

 Summary: LDAP PERFORMANCE ISSUE AND FAIL OVER
 Key: HDFS-9956
 URL: https://issues.apache.org/jira/browse/HDFS-9956
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: sanjay kenganahalli vamanna


LDAP group name resolution works well under typical scenarios. However, we 
have seen cases where a user is mapped to many groups (in an extreme case, 
more than 100 groups). The current implementation makes resolving groups from 
Active Directory extremely slow in this case, causing the namenode to fail 
over.
Instead of failing over, we can use the ha.zookeeper.session-timeout.ms 
parameter in the getgroups method to time out and send a failure response 
back to the user, preventing namenode failover.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9350) Avoid creating temporary strings in Block.toString() and getBlockName()

2016-03-14 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193520#comment-15193520
 ] 

Colin Patrick McCabe commented on HDFS-9350:


Thanks, [~cnauroth].  I uploaded the patch to HDFS-9947.  I would have done it 
earlier, but JIRA was a bit unresponsive.

> Avoid creating temporary strings in Block.toString() and getBlockName()
> 
>
> Key: HDFS-9350
> URL: https://issues.apache.org/jira/browse/HDFS-9350
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
>Priority: Minor
> Fix For: 2.9.0
>
> Attachments: HDFS-9350.001.patch
>
>
> Minor change to use a StringBuilder directly, avoiding the temporary Long 
> and block-name strings created when calling toString on a Block.
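A hypothetical, self-contained sketch of the kind of change described (the real method lives in Block; the names here are illustrative):

```java
public class BlockNameDemo {
    static final String BLOCK_FILE_PREFIX = "blk_";

    // Instead of: return BLOCK_FILE_PREFIX + String.valueOf(blockId);
    // appending the long directly into one StringBuilder avoids the
    // intermediate String that Long-to-String conversion plus
    // concatenation would allocate.
    static String getBlockName(long blockId) {
        return new StringBuilder(BLOCK_FILE_PREFIX.length() + 20)
            .append(BLOCK_FILE_PREFIX)
            .append(blockId)
            .toString();
    }
}
```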



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9947) Block#toString should not output information from derived classes

2016-03-14 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9947:
---
Attachment: HDFS-9947.001.patch

> Block#toString should not output information from derived classes
> -
>
> Key: HDFS-9947
> URL: https://issues.apache.org/jira/browse/HDFS-9947
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-9947.001.patch
>
>
> {{Block#toString}} should not output information from derived classes.  
> Thanks for [~cnauroth] for spotting this bug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9947) Block#toString should not output information from derived classes

2016-03-14 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9947:
---
Status: Patch Available  (was: Open)

> Block#toString should not output information from derived classes
> -
>
> Key: HDFS-9947
> URL: https://issues.apache.org/jira/browse/HDFS-9947
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-9947.001.patch
>
>
> {{Block#toString}} should not output information from derived classes.  
> Thanks for [~cnauroth] for spotting this bug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (HDFS-9947) Block#toString should not output information from derived classes

2016-03-14 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9947:
---
Comment: was deleted

(was: dupe of HDFS-9947 .?)

> Block#toString should not output information from derived classes
> -
>
> Key: HDFS-9947
> URL: https://issues.apache.org/jira/browse/HDFS-9947
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
>
> {{Block#toString}} should not output information from derived classes.  
> Thanks for [~cnauroth] for spotting this bug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (HDFS-9947) Block#toString should not output information from derived classes

2016-03-14 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9947:
---
Comment: was deleted

(was: HDFS-9948 is dupe of this issue..)

> Block#toString should not output information from derived classes
> -
>
> Key: HDFS-9947
> URL: https://issues.apache.org/jira/browse/HDFS-9947
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
>
> {{Block#toString}} should not output information from derived classes.  
> Thanks for [~cnauroth] for spotting this bug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-9948) Block#toString should not output information from derived classes

2016-03-14 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe resolved HDFS-9948.

Resolution: Duplicate

> Block#toString should not output information from derived classes
> -
>
> Key: HDFS-9948
> URL: https://issues.apache.org/jira/browse/HDFS-9948
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
>
> {{Block#toString}} should not output information from derived classes.  
> Thanks for [~cnauroth] for spotting this bug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9556) libhdfs++: pull Options from default configs by default

2016-03-14 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-9556:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed with 7751507233dc7888c8. Thanks for the review, [~James Clampffer].

> libhdfs++: pull Options from default configs by default
> ---
>
> Key: HDFS-9556
> URL: https://issues.apache.org/jira/browse/HDFS-9556
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9556.HDFS-8707.000.patch, 
> HDFS-9556.HDFS-8707.002.patch, HDFS-9556.HDFS-8707.003.patch, 
> HDFS-9556.HDFS-9932.003.patch
>
>
> Include method to connect to defaultFS from configuration



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-9932) libhdfs++: find a URI parsing library

2016-03-14 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen resolved HDFS-9932.
--
Resolution: Fixed

Committed with 196db0d6d41e27421a

> libhdfs++: find a URI parsing library
> -
>
> Key: HDFS-9932
> URL: https://issues.apache.org/jira/browse/HDFS-9932
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9932.HDFS-8707.000.patch
>
>
> The URI parsing implementation in HDFS-9556 using regex requires gcc 4.9+, 
> which seems a bit too steep at the moment.  Find some code to parse URIs so 
> we don't have to roll our own.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9954) Test RPC timeout fix of HADOOP-12672 against HDFS

2016-03-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193387#comment-15193387
 ] 

Hadoop QA commented on HDFS-9954:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 3s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
22s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 42s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 40s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 39s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 49s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 10m 30s 
{color} | {color:red} root-jdk1.8.0_74 with JDK v1.8.0_74 generated 2 new + 738 
unchanged - 0 fixed = 740 total (was 738) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 49s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 21s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 17m 51s 
{color} | {color:red} root-jdk1.7.0_95 with JDK v1.7.0_95 generated 2 new + 734 
unchanged - 0 fixed = 736 total (was 734) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 21s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
5s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
43s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 26s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 33s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 9m 29s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.8.0_74. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 1s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_74. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 27s 
{color} | {color:green} 

[jira] [Updated] (HDFS-9955) DataNode won't self-heal after some block dirs were manually misplaced

2016-03-14 Thread David Watzke (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Watzke updated HDFS-9955:
---
Description: 
I accidentally ran this tool on top of the datadirs of a DataNode (which was 
shut down at the time): 
https://github.com/killerwhile/volume-balancer

The tool makes assumptions about block directory placement that are no longer 
valid in Hadoop 2.6.0, so it simply moved block directories between datadirs to 
balance disk usage. Granted, running it was not a good idea, but my concern is 
how the DataNode (mis)handled the resulting state. The messages in the DN log 
(see below) show that the DataNode knew about the problem but did nothing to 
fix it (e.g. self-heal by copying the other replica), which seems like a bug 
to me. If you need any additional info, please just ask.

{noformat}
2016-03-04 12:40:06,008 WARN 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while finding 
block BP-680964103-77.234.46.18-1375882473930:blk_-3159875140074863904_0 on 
volume /data/18/cdfs/dn
2016-03-04 12:40:06,009 WARN 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while finding 
block BP-680964103-77.234.46.18-1375882473930:blk_8369468090548520777_0 on 
volume /data/18/cdfs/dn
2016-03-04 12:40:06,011 WARN 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while finding 
block BP-680964103-77.234.46.18-1375882473930:blk_1226431637_0 on volume 
/data/18/cdfs/dn
2016-03-04 12:40:06,012 WARN 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while finding 
block BP-680964103-77.234.46.18-1375882473930:blk_1169332185_0 on volume 
/data/18/cdfs/dn
2016-03-04 12:40:06,825 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
opReadBlock 
BP-680964103-77.234.46.18-1375882473930:blk_1226781281_1099829669050 received 
exception java.io.IOException: BlockId 1226781281 is not valid.
2016-03-04 12:40:06,825 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(5.45.56.30, 
datanodeUuid=9da950ca-87ae-44ee-9391-0bca669c796b, infoPort=50075, 
ipcPort=50020, 
storageInfo=lv=-56;cid=cluster12;nsid=1625487778;c=1438754073236):Got exception 
while serving 
BP-680964103-77.234.46.18-1375882473930:blk_1226781281_1099829669050 to 
/5.45.56.30:48146
java.io.IOException: BlockId 1226781281 is not valid.
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:650)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:641)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getMetaDataInputStream(FsDatasetImpl.java:214)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.&lt;init&gt;(BlockSender.java:282)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:529)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:243)
at java.lang.Thread.run(Thread.java:745)
2016-03-04 12:40:06,826 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 
prg04-002.ff.avast.com:50010:DataXceiver error processing READ_BLOCK operation  
src: /5.45.56.30:48146 dst: /5.45.56.30:50010
java.io.IOException: BlockId 1226781281 is not valid.
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:650)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:641)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getMetaDataInputStream(FsDatasetImpl.java:214)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.&lt;init&gt;(BlockSender.java:282)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:529)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:243)
at java.lang.Thread.run(Thread.java:745)
{noformat}

  was:
I have accidentally ran this tool on top of DataNode's datadirs (of a datanode 
that was shut down at the moment): 
https://github.com/killerwhile/volume-balancer

The tool makes assumptions about block directory placement that are no longer 
valid in hadoop 2.6.0 and it was just moving them around between different 
datadirs to make the disk usage balanced. OK, it was not a good idea to run it 
but my concern is the way the datanode was (not) handling the resulting state. 
I've seen these messages in DN log (see below) which means DN knew 

[jira] [Created] (HDFS-9955) DataNode won't self-heal after some block dirs were manually misplaced

2016-03-14 Thread David Watzke (JIRA)
David Watzke created HDFS-9955:
--

 Summary: DataNode won't self-heal after some block dirs were 
manually misplaced
 Key: HDFS-9955
 URL: https://issues.apache.org/jira/browse/HDFS-9955
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
 Environment: CentOS 6, Cloudera 5.4.4 (patched Hadoop 2.6.0)
Reporter: David Watzke


I accidentally ran this tool on top of the datadirs of a DataNode (which was 
shut down at the time): 
https://github.com/killerwhile/volume-balancer

The tool makes assumptions about block directory placement that are no longer 
valid in Hadoop 2.6.0, so it simply moved block directories between datadirs to 
balance disk usage. Granted, running it was not a good idea, but my concern is 
how the DataNode (mis)handled the resulting state. The messages in the DN log 
(see below) show that the DataNode knew about the problem but did nothing to 
fix it (e.g. self-heal by copying the other replica), which seems like a bug 
to me. If you need any additional info, please just ask.


2016-03-04 12:40:06,008 WARN 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while finding 
block BP-680964103-77.234.46.18-1375882473930:blk_-3159875140074863904_0 on 
volume /data/18/cdfs/dn
2016-03-04 12:40:06,009 WARN 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while finding 
block BP-680964103-77.234.46.18-1375882473930:blk_8369468090548520777_0 on 
volume /data/18/cdfs/dn
2016-03-04 12:40:06,011 WARN 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while finding 
block BP-680964103-77.234.46.18-1375882473930:blk_1226431637_0 on volume 
/data/18/cdfs/dn
2016-03-04 12:40:06,012 WARN 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while finding 
block BP-680964103-77.234.46.18-1375882473930:blk_1169332185_0 on volume 
/data/18/cdfs/dn
2016-03-04 12:40:06,825 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
opReadBlock 
BP-680964103-77.234.46.18-1375882473930:blk_1226781281_1099829669050 received 
exception java.io.IOException: BlockId 1226781281 is not valid.
2016-03-04 12:40:06,825 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(5.45.56.30, 
datanodeUuid=9da950ca-87ae-44ee-9391-0bca669c796b, infoPort=50075, 
ipcPort=50020, 
storageInfo=lv=-56;cid=cluster12;nsid=1625487778;c=1438754073236):Got exception 
while serving 
BP-680964103-77.234.46.18-1375882473930:blk_1226781281_1099829669050 to 
/5.45.56.30:48146
java.io.IOException: BlockId 1226781281 is not valid.
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:650)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:641)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getMetaDataInputStream(FsDatasetImpl.java:214)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.&lt;init&gt;(BlockSender.java:282)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:529)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:243)
at java.lang.Thread.run(Thread.java:745)
2016-03-04 12:40:06,826 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 
prg04-002.ff.avast.com:50010:DataXceiver error processing READ_BLOCK operation  
src: /5.45.56.30:48146 dst: /5.45.56.30:50010
java.io.IOException: BlockId 1226781281 is not valid.
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:650)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:641)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getMetaDataInputStream(FsDatasetImpl.java:214)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.&lt;init&gt;(BlockSender.java:282)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:529)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:243)
at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

