[jira] [Commented] (HDFS-12826) Document Saying the RPC port, But it's required IPC port in Balancer Document.

2017-11-22 Thread usharani (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262252#comment-16262252
 ] 

usharani commented on HDFS-12826:
-

[~vagarychen] thanks for taking a look.
bq. changing ipc to rpc instead?
Currently it is documented as {{rpc}}, which can be confusing since there is no 
such configuration for the DN. Changing it to {{ipc}} makes it clearer.

> Document Saying the RPC port, But it's required IPC port in Balancer Document.
> --
>
> Key: HDFS-12826
> URL: https://issues.apache.org/jira/browse/HDFS-12826
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, documentation
>Affects Versions: 3.0.0-beta1
>Reporter: Harshakiran Reddy
>Assignee: usharani
>Priority: Minor
> Attachments: HDFS-12826.patch
>
>
> In {{Adding a new Namenode to an existing HDFS cluster}}, the refreshNamenodes 
> command requires the IPC port, but the documentation says the RPC port.
> http://hadoop.apache.org/docs/r3.0.0-beta1/hadoop-project-dist/hadoop-hdfs/Federation.html#Balancer
> {noformat} 
> bin>:~/hdfsdata/HA/install/hadoop/datanode/bin> ./hdfs dfsadmin 
> -refreshNamenodes host-name:65110
> refreshNamenodes: Unknown protocol: 
> org.apache.hadoop.hdfs.protocol.ClientDatanodeProtocol
> bin.:~/hdfsdata/HA/install/hadoop/datanode/bin> ./hdfs dfsadmin 
> -refreshNamenodes
> Usage: hdfs dfsadmin [-refreshNamenodes datanode-host:ipc_port]
> bin>:~/hdfsdata/HA/install/hadoop/datanode/bin> ./hdfs dfsadmin 
> -refreshNamenodes host-name:50077
> bin>:~/hdfsdata/HA/install/hadoop/datanode/bin>
> {noformat} 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12853) Ozone: Optimize chunk writes for Ratis by avoiding double writes

2017-11-22 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-12853:
-
Description: 
Ozone, in replicated mode, writes the data twice: once to the raft log and then 
to the state machine.
This results in the data being written twice during a particular chunk write, 
which is suboptimal. With RATIS-122 the state machine in Ozone can be optimized 
to write the data only once.

  was:
Ozone, in replicated mode, writes the data twice: once to the raft log and then 
to the state machine.
This results in the data being written twice during a particular chunk write, 
which is suboptimal. With Ratis-122 the state machine in Ozone can be optimized 
to write the data only once.


> Ozone: Optimize chunk writes for Ratis by avoiding double writes
> 
>
> Key: HDFS-12853
> URL: https://issues.apache.org/jira/browse/HDFS-12853
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
> Fix For: HDFS-7240
>
>
> Ozone, in replicated mode, writes the data twice: once to the raft log and 
> then to the state machine.
> This results in the data being written twice during a particular chunk write, 
> which is suboptimal. With RATIS-122 the state machine in Ozone can be 
> optimized to write the data only once.
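For illustration, a minimal sketch of the double write and the intended single 
write, using hypothetical types rather than the actual Ratis or Ozone APIs:
{code:java}
// Hypothetical types for illustration only; not the Ratis or Ozone API.
interface ChunkSink {
  void write(String chunkName, byte[] data) throws java.io.IOException;
}

class ChunkWritePath {
  // Today: the chunk payload is persisted twice per write.
  void writeToday(ChunkSink raftLog, ChunkSink stateMachine,
                  String chunk, byte[] data) throws java.io.IOException {
    raftLog.write(chunk, data);       // copy #1: raft log entry
    stateMachine.write(chunk, data);  // copy #2: applied by the state machine
  }

  // A RATIS-122-style optimization: the log records only the entry metadata
  // and the state machine persists the payload once.
  void writeOptimized(ChunkSink metadataLog, ChunkSink stateMachine,
                      String chunk, byte[] data) throws java.io.IOException {
    metadataLog.write(chunk, new byte[0]); // log the entry, not the payload
    stateMachine.write(chunk, data);       // single copy of the data
  }
}
{code}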






[jira] [Updated] (HDFS-12851) Ozone: Upgrade to latest ratis build

2017-11-22 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-12851:
-
Attachment: HDFS-12851-HDFS-7240.001.patch

> Ozone: Upgrade to latest ratis build
> 
>
> Key: HDFS-12851
> URL: https://issues.apache.org/jira/browse/HDFS-12851
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
> Fix For: HDFS-7240
>
> Attachments: HDFS-12851-HDFS-7240.001.patch
>
>
> The Ratis build in Ozone needs to be upgraded to incorporate the memory leak 
> fixes (RATIS-135, RATIS-116), the state machine optimization (RATIS-122), and 
> the async client API (RATIS-113).






[jira] [Commented] (HDFS-12741) ADD support for KSM --createObjectStore command

2017-11-22 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262167#comment-16262167
 ] 

Yiqun Lin commented on HDFS-12741:
--

Thanks for working on this, [~shashikant]. Overall looks good to me, but I 
found many nitpicks. Here are my initial comments:
*KSMStorage.java*
line 36: It would be better to reuse the SCM id variable:
{code}
public static final String SCM_ID = "scmUuid";  -->  public static final String SCM_ID = SCMStorage.SCM_ID;
{code}
line 40: The comment is incorrect: {{Construct SCMStorage.}}

*KeySpaceManager.java*
line 99 and line 101: Can you format these lines?
line 127: The error message is not accurate. We may need to define a new error type.
line 291: Please use an {{equals}} check instead of {{==}} (see the sketch after this list).
line 294: The "+" is not needed.
line 300: Need a space before {{Updating}} and {{SCM}}.
line 309: The "+" is not needed.
line 323: "argument" should be {{Argument}}.
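For example, a generic Java illustration of the difference (not code from the 
patch):
{code:java}
public class EqualsDemo {
  public static void main(String[] args) {
    String a = new String("scmUuid");
    String b = "scmUuid";
    // == compares object references; equals() compares values.
    System.out.println(a == b);      // false: two distinct String objects
    System.out.println(a.equals(b)); // true: identical contents
  }
}
{code}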

*StorageContainerManager.java*: The change in this class is not related to this 
JIRA. Can you remove it?

Another comment: can you add a space at the beginning of all the comments that 
were added? For example:
{noformat}
//Read the version file info from KSM version file --> // Read the version file 
info from KSM version file
{noformat}
This is just a minor code style issue.

In addition, the description of this JIRA seems incorrect:
bq. The SCM version file is stored in the KSM metadata directory and before 
communicating...
From reviewing this, I found that the KSM version file is stored in the SCM 
metadata directory, not as the description mentions.




> ADD support for KSM --createObjectStore command
> ---
>
> Key: HDFS-12741
> URL: https://issues.apache.org/jira/browse/HDFS-12741
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7240
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
> Fix For: HDFS-7240
>
> Attachments: HDFS-12741-HDFS-7240.001.patch
>
>
> The KSM --createObjectStore command reads the ozone configuration information, 
> creates the KSM version file, and reads the SCM version file from the 
> specified SCM.
>   
> The SCM version file is stored in the KSM metadata directory, and before 
> communicating with an SCM, KSM verifies that it is communicating with an SCM 
> with which the relationship has been established via the createObjectStore 
> command.






[jira] [Commented] (HDFS-7240) Object store in HDFS

2017-11-22 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263485#comment-16263485
 ] 

Aaron Fabbri commented on HDFS-7240:


Thanks for taking notes, and thank you for the lively discussion. A lot of 
valid concerns on all sides here. A couple of minor corrections:
{quote}AaronF: Ozone is a lot of new code and Hadoop already has so much code.. 
{quote}
My concerns are particularly around things that affect stability and 
operability. I'm not concerned about lines of code, more about how manageable 
the codebase is in terms of getting a stable release and supporting it in 
production.

{quote}
shallow data copy is practical only if within same project and daemon otherwise 
have deal with security setting and coordinations across daemons.
{quote}
We can factor common code, if any, into a shared dependency. I don't see how 
the repository the code lives in really affects fast copy between storage 
systems. I can think of ways to do it both within a JVM process consisting of 
code from multiple git repositories, and via IPC (hand off ownership of a file 
to another process--not even talking about fancy stuff like shmem).

{quote}
The opponents will raise the same issue as today: show feature parity 
{quote}

I get your concern, but I didn't hear anyone say feature parity. I only heard 
"integrate with HDFS". Even integrated with HDFS, there is still a high bar of 
"utility" to pass, IMO, to justify a very large patch that affects production 
code.

We all want stable, scalable HDFS. Nobody opposes that ideal. 

I'm not sure trying to evolve HDFS to scale is a better approach than being 
separate with maybe some shared, well-factored dependencies.  The latter, IMO, 
could result in better code and dramatically less risk to HDFS.

Appreciate all your hard work thus far and appreciate the challenges you guys 
face here. I hope you can understand my perspective as well.

> Object store in HDFS
> 
>
> Key: HDFS-7240
> URL: https://issues.apache.org/jira/browse/HDFS-7240
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS Scalability and Ozone.pdf, HDFS-7240.001.patch, 
> HDFS-7240.002.patch, HDFS-7240.003.patch, HDFS-7240.003.patch, 
> HDFS-7240.004.patch, HDFS-7240.005.patch, HDFS-7240.006.patch, 
> MeetingMinutes.pdf, Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, 
> ozone_user_v0.pdf
>
>
> This jira proposes to add object store capabilities into HDFS. 
> As part of the federation work (HDFS-1052) we separated block storage as a 
> generic storage layer. Using the Block Pool abstraction, new kinds of 
> namespaces can be built on top of the storage layer i.e. datanodes.
> In this jira I will explore building an object store using the datanode 
> storage, but independent of namespace metadata.
> I will soon update with a detailed design document.






[jira] [Created] (HDFS-12856) BlockReconstructionWork.chooseTargets() violates namesystem locking

2017-11-22 Thread Konstantin Shvachko (JIRA)
Konstantin Shvachko created HDFS-12856:
--

 Summary: BlockReconstructionWork.chooseTargets() violates 
namesystem locking
 Key: HDFS-12856
 URL: https://issues.apache.org/jira/browse/HDFS-12856
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.7.4
Reporter: Konstantin Shvachko


{{BlockReconstructionWork.chooseTargets()}} is called outside the namesystem 
lock, although it works with {{DatanodeDescriptor}} and {{DatanodeStorageInfo}}, 
which can change.






[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-11-22 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263553#comment-16263553
 ] 

Tsz Wo Nicholas Sze commented on HDFS-12638:


Since this is about the NameNode failing with a NullPointerException, I agree 
that it is a blocker.

> NameNode exits due to ReplicationMonitor thread received Runtime exception in 
> ReplicationWork#chooseTargets
> ---
>
> Key: HDFS-12638
> URL: https://issues.apache.org/jira/browse/HDFS-12638
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.2
>Reporter: Jiandan Yang 
>Priority: Blocker
> Attachments: HDFS-12638-branch-2.8.2.001.patch, HDFS-12638.002.patch, 
> OphanBlocksAfterTruncateDelete.jpg
>
>
> The active NameNode exits due to an NPE. I can confirm that the 
> BlockCollection passed in when creating ReplicationWork is null, but I do not 
> know why it is null. Looking through the history, I found that 
> [HDFS-9754|https://issues.apache.org/jira/browse/HDFS-9754] removed the check 
> for whether the BlockCollection is null.
> The NN logs are as follows:
> {code:java}
> 2017-10-11 16:29:06,161 ERROR [ReplicationMonitor] 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> ReplicationMonitor thread received Runtime exception.
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.ReplicationWork.chooseTargets(ReplicationWork.java:55)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1532)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1491)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3792)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3744)
> at java.lang.Thread.run(Thread.java:834)
> {code}






[jira] [Commented] (HDFS-12836) startTxId could be greater than endTxId when tailing in-progress edit log

2017-11-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263526#comment-16263526
 ] 

Hadoop QA commented on HDFS-12836:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m  3s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
26s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  2s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 3 new + 14 unchanged - 0 fixed = 17 total (was 14) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 20s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}124m 36s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}195m 56s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSInotifyEventInputStream |
|   | hadoop.hdfs.TestRead |
|   | hadoop.hdfs.TestLeaseRecovery2 |
|   | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.TestDFSOutputStream |
|   | hadoop.hdfs.TestReadWhileWriting |
|   | hadoop.security.TestPermissionSymlinks |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.security.token.block.TestBlockToken |
|   | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.TestDFSShell |
|   | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure000 |
|   | hadoop.hdfs.security.TestDelegationToken |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12836 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12898928/HDFS-12836.2.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 90c6cdaebb3b 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality 

[jira] [Commented] (HDFS-12832) INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to NameNode exit

2017-11-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263573#comment-16263573
 ] 

Hadoop QA commented on HDFS-12832:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 24m 
42s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2.7 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
34s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
37s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
48s{color} | {color:green} branch-2.7 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 188 unchanged - 1 fixed = 188 total (was 189) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 60 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 19s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
56s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}127m 58s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Unreaped Processes | hadoop-hdfs:19 |
| Failed junit tests | hadoop.hdfs.TestClientReportBadBlock |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.web.TestWebHdfsTokens |
| Timed out junit tests | org.apache.hadoop.hdfs.TestSetrepDecreasing |
|   | org.apache.hadoop.hdfs.TestFileAppend4 |
|   | org.apache.hadoop.hdfs.TestRollingUpgradeDowngrade |
|   | org.apache.hadoop.hdfs.TestReadWhileWriting |
|   | org.apache.hadoop.hdfs.TestFileCorruption |
|   | org.apache.hadoop.hdfs.TestLease |
|   | org.apache.hadoop.hdfs.TestHDFSServerPorts |
|   | org.apache.hadoop.hdfs.TestDFSUpgrade |
|   | org.apache.hadoop.hdfs.web.TestWebHDFS |
|   | org.apache.hadoop.hdfs.TestAppendSnapshotTruncate |
|   | org.apache.hadoop.hdfs.TestRollingUpgradeRollback |
|   | org.apache.hadoop.hdfs.TestFSOutputSummer |
|   | org.apache.hadoop.hdfs.TestMiniDFSCluster |
|   | org.apache.hadoop.hdfs.TestBlockReaderFactory |
|   | org.apache.hadoop.hdfs.TestHFlush |
|   | org.apache.hadoop.hdfs.TestEncryptedTransfer |
|   | org.apache.hadoop.hdfs.TestDFSShell |
|   | org.apache.hadoop.hdfs.TestDataTransferProtocol |
|   | org.apache.hadoop.hdfs.TestHDFSTrash |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:17213a0 |
| JIRA Issue | HDFS-12832 |
| JIRA Patch URL | 

[jira] [Commented] (HDFS-12856) BlockReconstructionWork.chooseTargets() violates namesystem locking

2017-11-22 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263497#comment-16263497
 ] 

Konstantin Shvachko commented on HDFS-12856:


This is to address [~xkrogen]'s 
[comment|https://issues.apache.org/jira/browse/HDFS-12832?focusedCommentId=16259458&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16259458]
 in HDFS-12832.
I see two possible solutions:
# Surround the call with the lock (see the sketch below)
# Reorganize {{BlockReconstructionWork}} so that it stores only elementary 
fields, like IDs, to allow more fine-grained locking inside {{chooseTargets()}}

We should also deprecate the {{BlockPlacementPolicy.chooseTargets()}} methods 
with the {{srcPath}} parameter, since it is unused in all variants of the 
policy known to me.
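A minimal sketch of option 1 (illustrative only; the surrounding names are 
assumed from {{BlockManager}}'s reconstruction path, not a final patch):
{code:java}
// Option 1 sketch: hold the namesystem read lock while chooseTargets()
// reads DatanodeDescriptor / DatanodeStorageInfo state.
namesystem.readLock();
try {
  rw.chooseTargets(blockplacement, storagePolicySuite, excludedNodes);
} finally {
  namesystem.readUnlock();
}
{code}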

> BlockReconstructionWork.chooseTargets() violates namesystem locking
> ---
>
> Key: HDFS-12856
> URL: https://issues.apache.org/jira/browse/HDFS-12856
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.4
>Reporter: Konstantin Shvachko
>
> {{BlockReconstructionWork.chooseTargets()}} is called outside the namesystem 
> lock, although it works with {{DatanodeDescriptor}} and 
> {{DatanodeStorageInfo}}, which can change.






[jira] [Updated] (HDFS-12299) Race Between update pipeline and DN Re-Registration

2017-11-22 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-12299:
---
Attachment: HDFS-12299.001.branch-2.7.patch

I made a branch-2.7 patch. Kindly review. It's essentially HDFS-12299 plus 
HDFS-11817.
I wanted to backport HDFS-11817 as well, but that by itself is a pretty big change.

> Race Between update pipeline and DN Re-Registration
> ---
>
> Key: HDFS-12299
> URL: https://issues.apache.org/jira/browse/HDFS-12299
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Critical
>  Labels: release-blocker
> Fix For: 2.9.0, 3.0.0-beta1, 2.8.2
>
> Attachments: HDFS-12299-branch-2-002.patch, 
> HDFS-12299-branch-2.patch, HDFS-12299.001.branch-2.7.patch, HDFS-12299.patch
>
>
>  *Scenario*
>  - The pipeline is started with DN1->DN2->DN3
>  - DN1 re-registers and updatePipeline is called
>  - updatePipeline succeeds with DN1->DN3->DN4
>  - updatePipeline is called again, which fails with an NPE.
> In step 3, updatePipeline sets the storages to null, since DN1 re-registered 
> (which removes and re-adds the storages).
> {{FSNamesystem#updatePipelineInternal}}
> {code}
>    lastBlock.getUnderConstructionFeature().setExpectedLocations(lastBlock,
> storages, lastBlock.getBlockType())
> {code}
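For illustration only, a hypothetical defensive check (names and placement are 
assumptions, not the committed fix):
{code:java}
// Fail fast if a DN re-registration left null slots in the storage lookup,
// instead of hitting an NPE later in setExpectedLocations().
for (DatanodeStorageInfo storage : storages) {
  if (storage == null) {
    throw new java.io.IOException(
        "Cannot resolve storages for updatePipeline; a DN may have re-registered");
  }
}
{code}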






[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-11-22 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263522#comment-16263522
 ] 

Junping Du commented on HDFS-12638:
---

[~jingzhao] and [~szetszwo], any comments on Konstantin's proposal above?

> NameNode exits due to ReplicationMonitor thread received Runtime exception in 
> ReplicationWork#chooseTargets
> ---
>
> Key: HDFS-12638
> URL: https://issues.apache.org/jira/browse/HDFS-12638
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.2
>Reporter: Jiandan Yang 
>Priority: Blocker
> Attachments: HDFS-12638-branch-2.8.2.001.patch, HDFS-12638.002.patch, 
> OphanBlocksAfterTruncateDelete.jpg
>
>
> The active NameNode exits due to an NPE. I can confirm that the 
> BlockCollection passed in when creating ReplicationWork is null, but I do not 
> know why it is null. Looking through the history, I found that 
> [HDFS-9754|https://issues.apache.org/jira/browse/HDFS-9754] removed the check 
> for whether the BlockCollection is null.
> The NN logs are as follows:
> {code:java}
> 2017-10-11 16:29:06,161 ERROR [ReplicationMonitor] 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> ReplicationMonitor thread received Runtime exception.
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.ReplicationWork.chooseTargets(ReplicationWork.java:55)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1532)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1491)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3792)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3744)
> at java.lang.Thread.run(Thread.java:834)
> {code}






[jira] [Commented] (HDFS-12665) [AliasMap] Create a version of the AliasMap that runs in memory in the Namenode (leveldb)

2017-11-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263537#comment-16263537
 ] 

Hadoop QA commented on HDFS-12665:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  9m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 7 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-9806 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  5m 
33s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
57s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 
24s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
10s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
40s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 16s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 
16s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in HDFS-9806 
has 1 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
48s{color} | {color:red} hadoop-tools/hadoop-fs2img in HDFS-9806 has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m  
8s{color} | {color:green} HDFS-9806 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 11m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 
52s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 10s{color} | {color:orange} root: The patch generated 15 new + 626 unchanged 
- 0 fixed = 641 total (was 626) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
4s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  8s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
52s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 1 new 
+ 1 unchanged - 0 fixed = 2 total (was 1) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
18s{color} | {color:green} hadoop-project in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
27s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 96m 23s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| 

[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-11-22 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263561#comment-16263561
 ] 

Konstantin Shvachko commented on HDFS-12638:


I looked more closely through the changes introduced in HDFS-9754. It looks 
like a major refactoring whose goal seems like a minor optimization, and it has 
no test coverage. The main problem is that it violated the key invariant 
mentioned in the [first 
comment|https://issues.apache.org/jira/browse/HDFS-9754?focusedCommentId=15131492&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15131492]
 of that jira, stating
* *A block is always associated with a file when added to {{BlocksMap}}*

I propose to revert HDFS-9754. I think the invariant is important and should be 
preserved.
Reverting is not trivial, since more refactoring has accumulated on top of it, 
but it is possible.

The patch I posted here is still needed, since we should remove the extra block 
in the truncate case. It is not as critical though, since it is not the cause 
of the NPE, and the truncate block is eventually deleted when the DN finishes 
the truncate or on the next block report. So to resolve this blocker for the 
upcoming releases (2.8, 2.9, 3.0), we need to complete the revert.
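For reference, a minimal sketch of the kind of guard that was dropped 
(hypothetical names; not the exact pre-HDFS-9754 code):
{code:java}
// Skip reconstruction work for blocks whose file is gone, instead of letting
// chooseTargets() dereference a null BlockCollection later.
BlockCollection bc = getBlockCollection(block);
if (bc == null) {
  // The file was deleted or the block was abandoned; nothing to replicate.
  continue;
}
{code}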

> NameNode exits due to ReplicationMonitor thread received Runtime exception in 
> ReplicationWork#chooseTargets
> ---
>
> Key: HDFS-12638
> URL: https://issues.apache.org/jira/browse/HDFS-12638
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.2
>Reporter: Jiandan Yang 
>Priority: Blocker
> Attachments: HDFS-12638-branch-2.8.2.001.patch, HDFS-12638.002.patch, 
> OphanBlocksAfterTruncateDelete.jpg
>
>
> The active NameNode exits due to an NPE. I can confirm that the 
> BlockCollection passed in when creating ReplicationWork is null, but I do not 
> know why it is null. Looking through the history, I found that 
> [HDFS-9754|https://issues.apache.org/jira/browse/HDFS-9754] removed the check 
> for whether the BlockCollection is null.
> The NN logs are as follows:
> {code:java}
> 2017-10-11 16:29:06,161 ERROR [ReplicationMonitor] 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> ReplicationMonitor thread received Runtime exception.
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.ReplicationWork.chooseTargets(ReplicationWork.java:55)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1532)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1491)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3792)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3744)
> at java.lang.Thread.run(Thread.java:834)
> {code}






[jira] [Commented] (HDFS-12832) INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to NameNode exit

2017-11-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263596#comment-16263596
 ] 

Hadoop QA commented on HDFS-12832:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 18m 
47s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2.7 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
29s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
5s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
48s{color} | {color:green} branch-2.7 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 188 unchanged - 1 fixed = 188 total (was 189) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 60 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 32s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
50s{color} | {color:red} The patch generated 7 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}121m 59s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Unreaped Processes | hadoop-hdfs:17 |
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeInitStorage |
|   | hadoop.hdfs.server.datanode.TestRefreshNamenodes |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
| Timed out junit tests | org.apache.hadoop.hdfs.server.datanode.TestHSync |
|   | org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting 
|
|   | org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics |
|   | org.apache.hadoop.hdfs.server.datanode.TestReadOnlySharedStorage |
|   | org.apache.hadoop.hdfs.TestPread |
|   | org.apache.hadoop.hdfs.TestDecommission |
|   | org.apache.hadoop.hdfs.TestDFSAddressConfig |
|   | org.apache.hadoop.hdfs.TestDFSUpgrade |
|   | org.apache.hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
|   | org.apache.hadoop.hdfs.server.datanode.TestIncrementalBlockReports |
|   | org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache |
|   | org.apache.hadoop.hdfs.TestDFSRollback |
|   | org.apache.hadoop.hdfs.server.datanode.TestBlockScanner |
|   | org.apache.hadoop.hdfs.server.datanode.TestCachingStrategy |
|   | org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | org.apache.hadoop.hdfs.TestAbandonBlock |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:67e87c9 |
| JIRA Issue | HDFS-12832 |

[jira] [Commented] (HDFS-12832) INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to NameNode exit

2017-11-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263524#comment-16263524
 ] 

Hadoop QA commented on HDFS-12832:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 
52s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 36s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 51s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 91m  7s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}155m 20s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.fs.TestUnbuffer |
|   | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12832 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12898935/HDFS-12832.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 6f585d9ef76d 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 738d1a2 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22170/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22170/testReport/ |
| Max. process+thread count | 3882 (vs. ulimit of 5000) |
| modules | C: 

[jira] [Commented] (HDFS-9754) Avoid unnecessary getBlockCollection calls in BlockManager

2017-11-22 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263565#comment-16263565
 ] 

Konstantin Shvachko commented on HDFS-9754:
---

Proposing to revert this in HDFS-12638. [See this 
comment|https://issues.apache.org/jira/browse/HDFS-12638?focusedCommentId=16263561&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16263561]
 for more details.

> Avoid unnecessary getBlockCollection calls in BlockManager
> --
>
> Key: HDFS-9754
> URL: https://issues.apache.org/jira/browse/HDFS-9754
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: 2.9.0, 3.0.0-alpha1, 2.8.2
>
> Attachments: HDFS-9754.000.patch, HDFS-9754.001.patch, 
> HDFS-9754.002.patch
>
>
> Currently BlockManager calls {{Namesystem#getBlockCollection}} in order to:
> 1. check if the block has already been abandoned
> 2. identify the storage policy of the block
> 3. meta save
> For #1 we can use BlockInfo's internal state instead of checking if the 
> corresponding file still exists.
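For illustration, #1 amounts to a check along these lines (a sketch, assuming a 
{{BlockInfo}} state accessor like {{isDeleted()}}; not necessarily the committed 
code):
{code:java}
// Consult the block's own state instead of resolving the owning file via
// getBlockCollection().
if (storedBlock.isDeleted()) {
  // The block no longer belongs to any file; treat it as abandoned.
  return;
}
{code}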






[jira] [Commented] (HDFS-12832) INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to NameNode exit

2017-11-22 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263499#comment-16263499
 ] 

Konstantin Shvachko commented on HDFS-12832:


Also filed HDFS-12856 to further address [~xkrogen]'s comment above, which 
still remains valid after fixing the {{ArrayIndexOutOfBoundsException}} 
addressed in this jira.

> INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to 
> NameNode exit
> 
>
> Key: HDFS-12832
> URL: https://issues.apache.org/jira/browse/HDFS-12832
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.4, 3.0.0-beta1
>Reporter: DENG FEI
>Assignee: Konstantin Shvachko
>Priority: Critical
>  Labels: release-blocker
> Attachments: HDFS-12832-branch-2.002.patch, 
> HDFS-12832-branch-2.7.002.patch, HDFS-12832-trunk-001.patch, 
> HDFS-12832.002.patch, exception.log
>
>
> {code:title=INode.java|borderStyle=solid}
> public String getFullPathName() {
> // Get the full path name of this inode.
> if (isRoot()) {
>   return Path.SEPARATOR;
> }
> // compute size of needed bytes for the path
> int idx = 0;
> for (INode inode = this; inode != null; inode = inode.getParent()) {
>   // add component + delimiter (if not tail component)
>   idx += inode.getLocalNameBytes().length + (inode != this ? 1 : 0);
> }
> byte[] path = new byte[idx];
> for (INode inode = this; inode != null; inode = inode.getParent()) {
>   if (inode != this) {
> path[--idx] = Path.SEPARATOR_CHAR;
>   }
>   byte[] name = inode.getLocalNameBytes();
>   idx -= name.length;
>   System.arraycopy(name, 0, path, idx, name.length);
> }
> return DFSUtil.bytes2String(path);
>   }
> {code}
> We found an ArrayIndexOutOfBoundsException at 
> _{color:#707070}System.arraycopy(name, 0, path, idx, name.length){color}_ 
> when the ReplicationMonitor runs, and the NameNode quits.
> It seems the two loops are not synchronized: the path's length changes between 
> them.
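A minimal race-tolerant sketch (an illustration only, not necessarily the 
committed fix): collect the components in a single walk of the parent chain, so 
the sizing and the copying can never disagree, even if the chain changes 
between calls.
{code:java}
// Illustrative sketch: snapshot the parent chain once, then build the path
// from that snapshot, so the buffer size always matches what is copied.
public String getFullPathNameOnePass() {
  if (isRoot()) {
    return Path.SEPARATOR;
  }
  java.util.List<byte[]> names = new java.util.ArrayList<>();
  int len = 0;
  for (INode inode = this; inode != null; inode = inode.getParent()) {
    byte[] name = inode.getLocalNameBytes();
    names.add(name);
    len += name.length + (inode != this ? 1 : 0);
  }
  byte[] path = new byte[len];
  int idx = len;
  for (int i = 0; i < names.size(); i++) {
    byte[] name = names.get(i);
    idx -= name.length;
    System.arraycopy(name, 0, path, idx, name.length);
    if (i < names.size() - 1) {
      path[--idx] = Path.SEPARATOR_CHAR; // delimiter before each parent's name
    }
  }
  return DFSUtil.bytes2String(path);
}
{code}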






[jira] [Commented] (HDFS-7240) Object store in HDFS

2017-11-22 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263515#comment-16263515
 ] 

Anu Engineer commented on HDFS-7240:


h1. Ozone - Second community meeting
Time: Friday, November 17, 2017, at 4:00 pm PST


_Participants: Arpit Agarwal, Robert Boyd, Wei Chiu, Marton Elek, Anu Engineer, 
Aaron Fabbri, Manoj Govindassamy, Virajith Jalaparti, Aaron Myers, Jitendra 
Pandey, Sanjay Radia, Chao Sun, Bharat Viswanadham, Andrew Wang, Lei (Eddy) 
Xu, Wei Yan, Xiaoyu Yao. \[Apologies to anyone that I might have missed who 
joined over phone\]_


We started the meeting by discussing the notions of Ozone's block storage 
layer, followed by a deep dive into the code.
We discussed the notions of the block layer, which is similar to the HDFS block 
layer, Ozone's container layer, and how replication works via pipelines. Then 
we did a code walk-through of the Ozone codebase, starting with KSM, SCM, the 
container layer, and the REST handler.

We had some technical questions about containers: is the unit of replication 
the container, and can we truncate a block that is already part of a container, 
say block three inside it? Both were answered in the affirmative: the unit of 
replication is indeed a container, and you can truncate block three inside the 
container without any issues.


Once we finished the technical discussion, we discussed some of the merge 
issues; essentially, the question was whether we should postpone the merge of 
Ozone into HDFS.

* Andrew Wang wanted to know how this would benefit enterprise customers.
** It was pointed out that customers can use the storage via a Hadoop 
Compatible filesystem (FileSystem or FileContext) and, more importantly, that 
apps such as Hive and Spark which use those APIs will work (we are testing Hive 
and Spark). In fact, all the data in Ozone is expected to come via Hive, YARN, 
Spark, etc. Making Ozone work seamlessly via such Hadoop frameworks is very 
important because it enables real customer use.

* ATM objected to the Ozone merge, as he wanted to see the new block layer 
integrated with the existing NN. He argued that creating the block layer is 
just the first phase, and that the separation of the block layer inside the 
Namenode still needs to be done. He further argued that we should merge after 
the Namenode block separation is completely done.
** Sanjay countered that a project of this size can only be implemented in 
phases. Fixing HDFS's scalability in a fundamental way requires fixing both the 
namespace layer and the block layer. We provide a simpler namespace (key-value) 
as an intermediate step to allow real customer usage via Spark and Hive, and 
also as a way of stabilizing the new block layer. This is a good consistency 
point from which to start working on integrating with the hierarchical 
namespace of the NN.

* Aaron Fabbri was concerned that the code is new and may not be stable, and 
that the support load for HDFS is quite high; this would further destabilize 
HDFS.
** Sanjay's response: it was pointed out that the feature is not on by default 
and that the code is in a separate module. Indeed, new shareable parts, like 
the new Netty protocol engine in the DN, will replace the old thread-based 
protocol engine only with the HDFS community's blessing, after they have been 
sufficiently tested via the Ozone path. Further, Ozone can be kept disabled if 
a customer so desires.

* ATM's concern is that connecting the NN to the new block layer will require 
separating the NSM/BM lock (a good thing to do), which is very hard to do.
** Sanjay's response: this issue was raised and explained at yesterday's 
meeting. A very strong coupling was added between the block layer and the 
namespace layer when we wrote the new block pipeline as part of the append work 
in 2010: the block length of each replica at finalization time, especially 
under failures, has to be consistent. This is done in the central NN today (due 
to the lack of a raft/paxos-like protocol in the original block layer). The new 
block-container layer uses Raft for consistency and no longer needs a central 
agent like the NN. Thus the new block-container layer's built-in consistent 
state management eliminates this coupling and hence simplifies the separation 
of the lock.


> Object store in HDFS
> 
>
> Key: HDFS-7240
> URL: https://issues.apache.org/jira/browse/HDFS-7240
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS Scalability and Ozone.pdf, HDFS-7240.001.patch, 
> HDFS-7240.002.patch, HDFS-7240.003.patch, HDFS-7240.003.patch, 
> HDFS-7240.004.patch, HDFS-7240.005.patch, HDFS-7240.006.patch, 
> MeetingMinutes.pdf, Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, 
> ozone_user_v0.pdf
>
>
> This jira proposes to add object store capabilities into HDFS. 
> As part of the federation work (HDFS-1052) we separated block storage as a 
> generic storage layer. Using the Block Pool abstraction, new kinds of 
> namespaces can be built on top of the storage layer i.e. datanodes.
> In this jira I will explore building an object store using the datanode 
> storage, but independent of namespace metadata.
> I will soon update with a detailed design document.

[jira] [Updated] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-11-22 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-12638:
--
Target Version/s: 2.8.3, 3.0.0, 2.9.1  (was: 2.8.3)

> NameNode exits due to ReplicationMonitor thread received Runtime exception in 
> ReplicationWork#chooseTargets
> ---
>
> Key: HDFS-12638
> URL: https://issues.apache.org/jira/browse/HDFS-12638
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.2
>Reporter: Jiandan Yang 
>Priority: Blocker
> Attachments: HDFS-12638-branch-2.8.2.001.patch, HDFS-12638.002.patch, 
> OphanBlocksAfterTruncateDelete.jpg
>
>
> Active NameNode exits due to an NPE. I can confirm that the BlockCollection 
> passed in when creating ReplicationWork is null, but I do not know why the 
> BlockCollection is null. By viewing the history I found 
> [HDFS-9754|https://issues.apache.org/jira/browse/HDFS-9754] removed the check 
> of whether the BlockCollection is null.
> NN logs are as follows:
> {code:java}
> 2017-10-11 16:29:06,161 ERROR [ReplicationMonitor] 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> ReplicationMonitor thread received Runtime exception.
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.ReplicationWork.chooseTargets(ReplicationWork.java:55)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1532)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1491)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3792)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3744)
> at java.lang.Thread.run(Thread.java:834)
> {code}
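
For context, a minimal sketch of the kind of guard HDFS-9754 removed (the 
placement and surrounding BlockManager logic are assumptions; only the class 
names come from the stack trace above):

{code:java}
// Inside the replication-work loop, before constructing ReplicationWork
// (hypothetical placement; not a committed fix):
BlockCollection bc = blocksMap.getBlockCollection(block);
if (bc == null) {
  // The block became orphaned (e.g. by a truncate/delete race), so skip
  // it rather than letting ReplicationWork.chooseTargets() throw an NPE.
  continue;
}
{code}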



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12638) NameNode exits due to ReplicationMonitor thread received Runtime exception in ReplicationWork#chooseTargets

2017-11-22 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263649#comment-16263649
 ] 

Junping Du commented on HDFS-12638:
---

Should be a blocker for 2.9.1 and 3.0.0 as well. Adding them to the target 
versions for tracking.

> NameNode exits due to ReplicationMonitor thread received Runtime exception in 
> ReplicationWork#chooseTargets
> ---
>
> Key: HDFS-12638
> URL: https://issues.apache.org/jira/browse/HDFS-12638
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.2
>Reporter: Jiandan Yang 
>Priority: Blocker
> Attachments: HDFS-12638-branch-2.8.2.001.patch, HDFS-12638.002.patch, 
> OphanBlocksAfterTruncateDelete.jpg
>
>
> Active NameNode exits due to an NPE. I can confirm that the BlockCollection 
> passed in when creating ReplicationWork is null, but I do not know why the 
> BlockCollection is null. By viewing the history I found 
> [HDFS-9754|https://issues.apache.org/jira/browse/HDFS-9754] removed the check 
> of whether the BlockCollection is null.
> NN logs are as follows:
> {code:java}
> 2017-10-11 16:29:06,161 ERROR [ReplicationMonitor] 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> ReplicationMonitor thread received Runtime exception.
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.ReplicationWork.chooseTargets(ReplicationWork.java:55)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1532)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1491)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3792)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3744)
> at java.lang.Thread.run(Thread.java:834)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12786) Ozone: add port/service names to the ksm/scm web ui

2017-11-22 Thread Elek, Marton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDFS-12786:

Attachment: HDFS-12786-HDFS-7240.002.patch

> Ozone: add port/service names to the ksm/scm web ui
> ---
>
> Key: HDFS-12786
> URL: https://issues.apache.org/jira/browse/HDFS-12786
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Minor
> Attachments: HDFS-12786-HDFS-7240.001.patch, 
> HDFS-12786-HDFS-7240.002.patch
>
>
> Since HDFS-12655 an additional serviceNames field is available for all RPC 
> services via the metrics interface.
> This super small patch modifies the scm/ksm web UI to display this name.
> Instead of
> :9863
> We will display:
> ScmBlockLocationProtocolService (:9863)
> TESTING:
> Start dozone cluster and check the header of the rpc metrics section on the 
> web ui: http://localhost:9876/



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12786) Ozone: add port/service names to the ksm/scm web ui

2017-11-22 Thread Elek, Marton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDFS-12786:

Attachment: (was: /tmp/HDFS-12786-HDFS-7240.002.patch)

> Ozone: add port/service names to the ksm/scm web ui
> ---
>
> Key: HDFS-12786
> URL: https://issues.apache.org/jira/browse/HDFS-12786
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Minor
> Attachments: HDFS-12786-HDFS-7240.001.patch
>
>
> Since HDFS-12655 an additional serviceNames field is available for all RPC 
> services via the metrics interface.
> This super small patch modifies the scm/ksm web UI to display this name.
> Instead of
> :9863
> We will display:
> ScmBlockLocationProtocolService (:9863)
> TESTING:
> Start dozone cluster and check the header of the rpc metrics section on the 
> web ui: http://localhost:9876/



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12786) Ozone: add port/service names to the ksm/scm web ui

2017-11-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262490#comment-16262490
 ] 

Hadoop QA commented on HDFS-12786:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} HDFS-7240 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
19s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
29m 44s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 47s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 43m 31s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d11161b |
| JIRA Issue | HDFS-12786 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12898846/HDFS-12786-HDFS-7240.002.patch
 |
| Optional Tests |  asflicense  shadedclient  |
| uname | Linux e9aa766c78ac 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-7240 / a20a3fe |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 333 (vs. ulimit of 5000) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22164/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Ozone: add port/service names to the ksm/scm web ui
> ---
>
> Key: HDFS-12786
> URL: https://issues.apache.org/jira/browse/HDFS-12786
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Minor
> Attachments: HDFS-12786-HDFS-7240.001.patch, 
> HDFS-12786-HDFS-7240.002.patch
>
>
> Since HDFS-12655 an additional serviceNames field is available for all RPC 
> services via the metrics interface.
> This super small patch modifies the scm/ksm web UI to display this name.
> Instead of
> :9863
> We will display:
> ScmBlockLocationProtocolService (:9863)
> TESTING:
> Start dozone cluster and check the header of the rpc metrics section on the 
> web ui: http://localhost:9876/



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-12854) Centralized Cache Management should support ViewFileSystem.

2017-11-22 Thread Harshakiran Reddy (JIRA)
Harshakiran Reddy created HDFS-12854:


 Summary: Centralized Cache Management should support 
ViewFileSystem.
 Key: HDFS-12854
 URL: https://issues.apache.org/jira/browse/HDFS-12854
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: caching, hdfs
Affects Versions: 3.0.0-alpha1
Reporter: Harshakiran Reddy


Currently, {{ViewFileSystem}} does not support Centralized Cache Management; it 
will throw an {{IllegalArgumentException}}.

{noformat} 
./hdfs cacheadmin -listPools
IllegalArgumentException: FileSystem viewfs://ClusterX/ is not an HDFS file 
system 
{noformat}
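
A rough sketch of the kind of resolution the caching commands could do (a 
workaround idea only, not a committed design; the mount point and path names 
are made up):

{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.CachePoolEntry;

public class ViewFsCacheSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem viewFs = FileSystem.get(URI.create("viewfs://ClusterX/"), conf);
    // Resolve the viewfs mount point to the backing cluster, e.g.
    // viewfs://ClusterX/data -> hdfs://nn1/data.
    Path resolved = viewFs.resolvePath(new Path("/data"));
    FileSystem target = resolved.getFileSystem(conf);
    if (target instanceof DistributedFileSystem) {
      RemoteIterator<CachePoolEntry> pools =
          ((DistributedFileSystem) target).listCachePools();
      while (pools.hasNext()) {
        System.out.println(pools.next().getInfo().getPoolName());
      }
    }
  }
}
{code}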



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12786) Ozone: add port/service names to the ksm/scm web ui

2017-11-22 Thread Elek, Marton (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262424#comment-16262424
 ] 

Elek, Marton commented on HDFS-12786:
-

Sure, I removed the colon. I was also hesitant about the right format. Thanks 
for the review.

> Ozone: add port/service names to the ksm/scm web ui
> ---
>
> Key: HDFS-12786
> URL: https://issues.apache.org/jira/browse/HDFS-12786
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Minor
> Attachments: HDFS-12786-HDFS-7240.001.patch, 
> HDFS-12786-HDFS-7240.002.patch
>
>
> Since HDFS-12655 an additional serviceNames field is available for all RPC 
> services via the metrics interface.
> This super small patch modifies the scm/ksm web UI to display this name.
> Instead of
> :9863
> We will display:
> ScmBlockLocationProtocolService (:9863)
> TESTING:
> Start dozone cluster and check the header of the rpc metrics section on the 
> web ui: http://localhost:9876/



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12854) Centralized Cache Management should support ViewFileSystem.

2017-11-22 Thread Sreejith MV (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262420#comment-16262420
 ] 

Sreejith MV commented on HDFS-12854:


I would like to work on this issue

> Centralized Cache Management should support ViewFileSystem.
> ---
>
> Key: HDFS-12854
> URL: https://issues.apache.org/jira/browse/HDFS-12854
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, hdfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Harshakiran Reddy
>  Labels: HDFS, cache, viewfs
>
> Currently, {{ViewFileSystem}} does not support Centralized Cache Management; 
> it will throw an {{IllegalArgumentException}}.
> {noformat} 
> ./hdfs cacheadmin -listPools
> IllegalArgumentException: FileSystem viewfs://ClusterX/ is not an HDFS file 
> system 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12681) Make HdfsLocatedFileStatus a subtype of LocatedFileStatus

2017-11-22 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262341#comment-16262341
 ] 

Steve Loughran commented on HDFS-12681:
---

It's nice to see the builder pattern here. I've been doing some stuff with 
subclasses of FileStatus recently and it's not great for subclassing.
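
For what it's worth, a tiny illustration of the self-typed builder idiom that 
makes such hierarchies subclass-friendly (a generic sketch, not code from the 
patch; all class names here are made up):

{code:java}
// Self-typed ("curiously recurring") builder: subclasses can chain the
// parent's setters without losing their own type.
abstract class StatusBuilder<B extends StatusBuilder<B>> {
  long length;
  B length(long len) { this.length = len; return self(); }
  abstract B self();
}

final class LocatedStatusBuilder extends StatusBuilder<LocatedStatusBuilder> {
  String[] locations;
  LocatedStatusBuilder locations(String[] locs) { this.locations = locs; return this; }
  @Override LocatedStatusBuilder self() { return this; }
  // e.g. new LocatedStatusBuilder().length(42L).locations(hosts) still
  // returns LocatedStatusBuilder, not the parent builder type.
}
{code}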




> Make HdfsLocatedFileStatus a subtype of LocatedFileStatus
> -
>
> Key: HDFS-12681
> URL: https://issues.apache.org/jira/browse/HDFS-12681
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chris Douglas
>Assignee: Chris Douglas
>Priority: Minor
> Attachments: HDFS-12681.00.patch, HDFS-12681.01.patch, 
> HDFS-12681.02.patch, HDFS-12681.03.patch, HDFS-12681.04.patch, 
> HDFS-12681.05.patch, HDFS-12681.06.patch, HDFS-12681.07.patch, 
> HDFS-12681.08.patch, HDFS-12681.09.patch, HDFS-12681.10.patch, 
> HDFS-12681.11.patch, HDFS-12681.12.patch, HDFS-12681.13.patch
>
>
> {{HdfsLocatedFileStatus}} is a subtype of {{HdfsFileStatus}}, but not of 
> {{LocatedFileStatus}}. Conversion requires copying common fields and shedding 
> unknown data. It would be cleaner and sufficient for {{HdfsFileStatus}} to 
> extend {{LocatedFileStatus}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12832) INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to NameNode exit

2017-11-22 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263620#comment-16263620
 ] 

Konstantin Shvachko commented on HDFS-12832:


TestUnbuffer is tracked under HDFS-12815. Other tests succeeded locally.

> INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to 
> NameNode exit
> 
>
> Key: HDFS-12832
> URL: https://issues.apache.org/jira/browse/HDFS-12832
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.4, 3.0.0-beta1
>Reporter: DENG FEI
>Assignee: Konstantin Shvachko
>Priority: Critical
>  Labels: release-blocker
> Attachments: HDFS-12832-branch-2.002.patch, 
> HDFS-12832-branch-2.7.002.patch, HDFS-12832-trunk-001.patch, 
> HDFS-12832.002.patch, exception.log
>
>
> {code:title=INode.java|borderStyle=solid}
> public String getFullPathName() {
> // Get the full path name of this inode.
> if (isRoot()) {
>   return Path.SEPARATOR;
> }
> // compute size of needed bytes for the path
> int idx = 0;
> for (INode inode = this; inode != null; inode = inode.getParent()) {
>   // add component + delimiter (if not tail component)
>   idx += inode.getLocalNameBytes().length + (inode != this ? 1 : 0);
> }
> byte[] path = new byte[idx];
> for (INode inode = this; inode != null; inode = inode.getParent()) {
>   if (inode != this) {
> path[--idx] = Path.SEPARATOR_CHAR;
>   }
>   byte[] name = inode.getLocalNameBytes();
>   idx -= name.length;
>   System.arraycopy(name, 0, path, idx, name.length);
> }
> return DFSUtil.bytes2String(path);
>   }
> {code}
> We found an ArrayIndexOutOfBoundsException at 
> _{color:#707070}System.arraycopy(name, 0, path, idx, name.length){color}_ 
> when the ReplicationMonitor runs, and the NameNode will quit.
> It seems the two loops are not synchronized; the path's length changes 
> between them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12832) INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to NameNode exit

2017-11-22 Thread Erik Krogen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263903#comment-16263903
 ] 

Erik Krogen commented on HDFS-12832:


Hey [~shv], the changes look good to me, and thank you for filing the two 
follow-on JIRAs to address these rare but potentially disastrous race 
conditions. The lack of an additional unit test seems very reasonable to me. I 
wonder if there is somewhere we should add developer documentation to indicate 
that this type of usage pattern is not safe?
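
For readers following along, one safe shape for this pattern is to snapshot the 
components in a single pass before sizing and copying (an illustration only, 
not the committed fix):

{code:java}
// Hypothetical rewrite of getFullPathName(): snapshot each component
// once, then size and copy from the snapshot, so a concurrent move or
// delete cannot change the path length between the two loops.
List<byte[]> components = new ArrayList<>();
for (INode inode = this; inode != null; inode = inode.getParent()) {
  components.add(inode.getLocalNameBytes());
}
int size = 0;
for (int i = 0; i < components.size(); i++) {
  // add component + delimiter (if not tail component)
  size += components.get(i).length + (i == 0 ? 0 : 1);
}
byte[] path = new byte[size];
int idx = size;
for (int i = 0; i < components.size(); i++) {
  if (i != 0) {
    path[--idx] = Path.SEPARATOR_CHAR;
  }
  byte[] name = components.get(i);
  idx -= name.length;
  System.arraycopy(name, 0, path, idx, name.length);
}
return DFSUtil.bytes2String(path);
{code}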



> INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to 
> NameNode exit
> 
>
> Key: HDFS-12832
> URL: https://issues.apache.org/jira/browse/HDFS-12832
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.4, 3.0.0-beta1
>Reporter: DENG FEI
>Assignee: Konstantin Shvachko
>Priority: Critical
>  Labels: release-blocker
> Attachments: HDFS-12832-branch-2.002.patch, 
> HDFS-12832-branch-2.7.002.patch, HDFS-12832-trunk-001.patch, 
> HDFS-12832.002.patch, exception.log
>
>
> {code:title=INode.java|borderStyle=solid}
> public String getFullPathName() {
> // Get the full path name of this inode.
> if (isRoot()) {
>   return Path.SEPARATOR;
> }
> // compute size of needed bytes for the path
> int idx = 0;
> for (INode inode = this; inode != null; inode = inode.getParent()) {
>   // add component + delimiter (if not tail component)
>   idx += inode.getLocalNameBytes().length + (inode != this ? 1 : 0);
> }
> byte[] path = new byte[idx];
> for (INode inode = this; inode != null; inode = inode.getParent()) {
>   if (inode != this) {
> path[--idx] = Path.SEPARATOR_CHAR;
>   }
>   byte[] name = inode.getLocalNameBytes();
>   idx -= name.length;
>   System.arraycopy(name, 0, path, idx, name.length);
> }
> return DFSUtil.bytes2String(path);
>   }
> {code}
> We found an ArrayIndexOutOfBoundsException at 
> _{color:#707070}System.arraycopy(name, 0, path, idx, name.length){color}_ 
> when the ReplicationMonitor runs, and the NameNode will quit.
> It seems the two loops are not synchronized; the path's length changes 
> between them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Reopened] (HDFS-12780) Fix spelling mistake in DistCpUtils.java

2017-11-22 Thread Jianfei Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianfei Jiang reopened HDFS-12780:
--

> Fix spelling mistake in DistCpUtils.java
> 
>
> Key: HDFS-12780
> URL: https://issues.apache.org/jira/browse/HDFS-12780
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0-beta1
>Reporter: Jianfei Jiang
>  Labels: patch
> Attachments: HDFS-12780.patch
>
>
> We found a spelling mistake in DistCpUtils.java.  "* If checksums's can't be 
> retrieved," should be " * If checksums can't be retrieved,"



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12641) Backport HDFS-11755 into branch-2.7 to fix a regression in HDFS-11445

2017-11-22 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262705#comment-16262705
 ] 

Brahma Reddy Battula commented on HDFS-12641:
-

[~jojochuang] thanks for working on this.

The {{testcase}} passes irrespective of the fix. Anything I missed?

> Backport HDFS-11755 into branch-2.7 to fix a regression in HDFS-11445
> -
>
> Key: HDFS-12641
> URL: https://issues.apache.org/jira/browse/HDFS-12641
> Project: Hadoop HDFS
>  Issue Type: Task
>Affects Versions: 2.7.4
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Blocker
>  Labels: release-blocker
> Attachments: HDFS-12641.branch-2.7.001.patch
>
>
> Our internal testing caught a regression in HDFS-11445 when we cherry-picked 
> the commit into CDH. Basically, it produces bogus missing file warnings. 
> Further analysis revealed that the regression is actually fixed by HDFS-11755.
> Because of the order commits are merged in branch-2.8 ~ trunk (HDFS-11755 was 
> committed before HDFS-11445), the regression was never actually surfaced for 
> Hadoop 2.8/3.0.0-(alpha/beta) users. Since branch-2.7 has HDFS-11445 but no 
> HDFS-11755, I suspect the regression is more visible for Hadoop 2.7.4.
> I am filing this jira to raise awareness, rather than simply backporting 
> HDFS-11755 into branch-2.7.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12754) Lease renewal can hit a deadlock

2017-11-22 Thread Kuhu Shukla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuhu Shukla updated HDFS-12754:
---
Attachment: HDFS-12754.009.patch

Thank you [~xiaochen], I missed putting up the version which had the nits 
addressed as well. Updated patch.
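
For readers skimming the thread, the rough shape of the deadlock is a classic 
lock-order inversion (paraphrased from the issue description below; the exact 
call sites are an assumption):

{code:java}
//   Thread A (client close path)        Thread B (LeaseRenewer thread)
//   -----------------------------       ------------------------------
//   takes the client-side lock          takes the renewer-side lock
//   closeFile() then needs the          renewing calls back into
//   renewer-side lock                   DFSClient#removeFileBeingWritten,
//                                       which needs the client-side lock
//
// Each thread holds the lock the other one needs, so both block forever.
{code}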

> Lease renewal can hit a deadlock 
> -
>
> Key: HDFS-12754
> URL: https://issues.apache.org/jira/browse/HDFS-12754
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: HDFS-12754.001.patch, HDFS-12754.002.patch, 
> HDFS-12754.003.patch, HDFS-12754.004.patch, HDFS-12754.005.patch, 
> HDFS-12754.006.patch, HDFS-12754.007.patch, HDFS-12754.008.patch, 
> HDFS-12754.009.patch
>
>
> The client and the renewer can hit a deadlock during the close operation, 
> since closeFile() reaches back to DFSClient#removeFileBeingWritten. This is 
> possible if the client calls close() while the renewer is renewing a lease.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12812) Ozone: dozone: initialize scm directory on cluster startup

2017-11-22 Thread Elek, Marton (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262551#comment-16262551
 ] 

Elek, Marton commented on HDFS-12812:
-

The patch requires a new base image. I uploaded the source of the base image to 
HADOOP-14898 as a proposed patch. It is available from dockerhub as 
elek/hadoop-runner but can be replaced with the official apache/hadoop-runner 
once it is merged. (Or it could be used to create the image locally.)

> Ozone: dozone: initialize scm directory on cluster startup
> --
>
> Key: HDFS-12812
> URL: https://issues.apache.org/jira/browse/HDFS-12812
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Elek, Marton
>Assignee: Elek, Marton
> Attachments: HADOOP-12812-HDFS-7240.001.patch
>
>
> HDFS-12739 fixed the scm, but after the patch it couldn't be started without 
> a separate `hdfs scm -init` any more. Unfortunately, this breaks the 
> docker-compose functionality.
> This is handled in the starter script of the base runner docker image for 
> the namenode. I also fixed this in the runner docker image 
> (https://github.com/elek/hadoop/commit/b347eb4bfca37d84dbcdd8c4bf353219d876a9b7)
>  and will upload the improved patch to HADOOP-14898.
> In this patch I just add a new environment variable to init the scm if the 
> version file doesn't exist.
> To test:
> Do a full build and go to the dev-support/compose/ozone:
> ```
> docker-compose pull
> docker-compose up -d
> ```
> Note: the pull is important, as the new docker image -- which can handle the 
> new environment variable -- should be used



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12786) Ozone: add port/service names to the ksm/scm web ui

2017-11-22 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-12786:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: HDFS-7240
   Status: Resolved  (was: Patch Available)

Thanks [~elek] for the contribution. I've committed the patch to the feature 
branch.

> Ozone: add port/service names to the ksm/scm web ui
> ---
>
> Key: HDFS-12786
> URL: https://issues.apache.org/jira/browse/HDFS-12786
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Minor
> Fix For: HDFS-7240
>
> Attachments: HDFS-12786-HDFS-7240.001.patch, 
> HDFS-12786-HDFS-7240.002.patch
>
>
> Since HDFS-12655 an additional serviceNames field is available for all RPC 
> services via the metrics interface.
> This super small patch modifies the scm/ksm web UI to display this name.
> Instead of
> :9863
> We will display:
> ScmBlockLocationProtocolService (:9863)
> TESTING:
> Start dozone cluster and check the header of the rpc metrics section on the 
> web ui: http://localhost:9876/



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12681) Make HdfsLocatedFileStatus a subtype of LocatedFileStatus

2017-11-22 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262929#comment-16262929
 ] 

Chris Douglas commented on HDFS-12681:
--

bq. I've been doing some stuff with subclasses of FileStatus recently and it's 
not great for subclassing.
Yes, "not great" is a succinct way to put it. I tend to use more profanity.

While a little unusual, the above seemed like the least-convoluted way to 
support all the constraints on the design without bleeding changes throughout 
HDFS. Again, we could extract an interface from {{FileStatus}}, but the type 
system wouldn't be verifying anything relevant.

[~ste...@apache.org]/[~bharatviswa]/[~arpiagariu]/[~djp]/[~elgoiri], do you 
have cycles to review the patch? This is necessary for HDFS-9806, which we'd 
like to merge in the next couple weeks.

> Make HdfsLocatedFileStatus a subtype of LocatedFileStatus
> -
>
> Key: HDFS-12681
> URL: https://issues.apache.org/jira/browse/HDFS-12681
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chris Douglas
>Assignee: Chris Douglas
>Priority: Minor
> Attachments: HDFS-12681.00.patch, HDFS-12681.01.patch, 
> HDFS-12681.02.patch, HDFS-12681.03.patch, HDFS-12681.04.patch, 
> HDFS-12681.05.patch, HDFS-12681.06.patch, HDFS-12681.07.patch, 
> HDFS-12681.08.patch, HDFS-12681.09.patch, HDFS-12681.10.patch, 
> HDFS-12681.11.patch, HDFS-12681.12.patch, HDFS-12681.13.patch
>
>
> {{HdfsLocatedFileStatus}} is a subtype of {{HdfsFileStatus}}, but not of 
> {{LocatedFileStatus}}. Conversion requires copying common fields and shedding 
> unknown data. It would be cleaner and sufficient for {{HdfsFileStatus}} to 
> extend {{LocatedFileStatus}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12851) Ozone: Upgrade to latest ratis build

2017-11-22 Thread Mukul Kumar Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262938#comment-16262938
 ] 

Mukul Kumar Singh commented on HDFS-12851:
--

[~szetszwo] & [~anu] Please have a look.

> Ozone: Upgrade to latest ratis build
> 
>
> Key: HDFS-12851
> URL: https://issues.apache.org/jira/browse/HDFS-12851
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
> Fix For: HDFS-7240
>
> Attachments: HDFS-12851-HDFS-7240.001.patch, 
> HDFS-12851-HDFS-7240.002.patch
>
>
> The Ratis build on Ozone needs to be upgraded to incorporate memory leak 
> fixes (RATIS-135, RATIS-116), a state machine optimization (RATIS-122), and 
> the async client API (RATIS-113).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12851) Ozone: Upgrade to latest ratis build

2017-11-22 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-12851:
-
Status: Patch Available  (was: Open)

> Ozone: Upgrade to latest ratis build
> 
>
> Key: HDFS-12851
> URL: https://issues.apache.org/jira/browse/HDFS-12851
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
> Fix For: HDFS-7240
>
> Attachments: HDFS-12851-HDFS-7240.001.patch, 
> HDFS-12851-HDFS-7240.002.patch
>
>
> The Ratis build on Ozone needs to be upgraded to incorporate memory leak 
> fixes (RATIS-135, RATIS-116), a state machine optimization (RATIS-122), and 
> the async client API (RATIS-113).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12588) Use GenericOptionsParser for scm and ksm daemon

2017-11-22 Thread Elek, Marton (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262937#comment-16262937
 ] 

Elek, Marton commented on HDFS-12588:
-

The patch is rebased on top of HDFS-7240 according to the latest changes.

I also created a follow-up jira for the second comment of [~cheersyang] 
(mapreduce-specific parameters should be hidden).

> Use GenericOptionsParser for scm and ksm daemon
> ---
>
> Key: HDFS-12588
> URL: https://issues.apache.org/jira/browse/HDFS-12588
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7240
>Reporter: Elek, Marton
>Assignee: Elek, Marton
> Attachments: HDFS-12588-HDFS-7240.001.patch, 
> HDFS-12588-HDFS-7240.002.patch, HDFS-12588-HDFS-7240.003.patch, 
> HDFS-12588-HDFS-7240.004.patch, HDFS-12588-HDFS-7240.005.patch
>
>
> Most of the hadoop commands use the GenericOptionsParser to support some 
> common CLI arguments (such as -conf, -D, or -libjars to supply configuration, 
> modify configuration, or modify the classpath).
> I suggest using the same common options for the scm and ksm daemons as well, 
> as:
> 1. It allows the existing cluster management tools/scripts to be used, since 
> the daemons can be configured in the same way as the namenode and datanode.
> 2. It follows the convention from hadoop common.
> 3. It's easier to develop from the IDE (I start the ksm/scm/datanode/namenode 
> from intellij, but I need to add the configuration to the classpath. With 
> -conf I would be able to use an external configuration.)
> I found one problem during the implementation. Until now we used the `hdfs 
> scm` command both for the daemon and for the scm command line client. With no 
> parameters the daemon was started; with parameters the cli was started. The 
> help listed only the daemon.
> The -conf (GenericOptionsParser) could be used only if we separate the scm 
> and scmcli commands. But anyway, it's cleaner and more visible if we have 
> separate `hdfs scm` and `hdfs scmcli` commands.
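
A minimal sketch of the daemon entry-point pattern being proposed 
(GenericOptionsParser itself is standard Hadoop; the SCM wiring and the 
startScm helper are assumptions):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.GenericOptionsParser;

public class ScmDaemonSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Consumes -conf/-D/-libjars etc. and applies them to conf.
    GenericOptionsParser parser = new GenericOptionsParser(conf, args);
    String[] remaining = parser.getRemainingArgs();
    // remaining now holds only scm-specific arguments; start the daemon
    // with the (possibly externally supplied) configuration.
    startScm(conf, remaining);  // hypothetical helper
  }

  private static void startScm(Configuration conf, String[] args) {
    // daemon bootstrap elided
  }
}
{code}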



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12851) Ozone: Upgrade to latest ratis build

2017-11-22 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-12851:
-
Attachment: HDFS-12851-HDFS-7240.002.patch

> Ozone: Upgrade to latest ratis build
> 
>
> Key: HDFS-12851
> URL: https://issues.apache.org/jira/browse/HDFS-12851
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
> Fix For: HDFS-7240
>
> Attachments: HDFS-12851-HDFS-7240.001.patch, 
> HDFS-12851-HDFS-7240.002.patch
>
>
> The Ratis build on Ozone needs to be upgraded to incorporate memory leak 
> fixes (RATIS-135, RATIS-116), a state machine optimization (RATIS-122), and 
> the async client API (RATIS-113).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12836) startTxId could be greater than endTxId when tailing in-progress edit log

2017-11-22 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262958#comment-16262958
 ] 

Wei-Chiu Chuang commented on HDFS-12836:


Thanks [~csun].
The fix and test look good to me. Would it be reasonable to add a warning 
message if endTxId is less than remoteLog.getStartTxId()? We assume this only 
happens when there's a config inconsistency in dfs.ha.tail-edits.in-progress, 
but I want to make sure that if a future bug causes this, it doesn't get 
ignored silently.
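
Something along these lines, perhaps (a sketch of the suggestion only, not the 
actual patch; LOG and the surrounding loop are assumed from the code quoted in 
the description below):

{code:java}
if (onlyDurableTxns && inProgressOk) {
  endTxId = Math.min(endTxId, committedTxnId);
}
if (endTxId < remoteLog.getStartTxId()) {
  // Nothing durable to tail from this segment yet; warn instead of
  // building a stream whose startTxId exceeds its endTxId.
  LOG.warn("Skipping edit log segment starting at txid "
      + remoteLog.getStartTxId() + ": committed endTxId " + endTxId
      + " is smaller. Check dfs.ha.tail-edits.in-progress consistency.");
  continue;
}
{code}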

> startTxId could be greater than endTxId when tailing in-progress edit log
> -
>
> Key: HDFS-12836
> URL: https://issues.apache.org/jira/browse/HDFS-12836
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HDFS-12836.1.patch
>
>
> When {{dfs.ha.tail-edits.in-progress}} is true, the edit log tailer will 
> also tail in-progress edit log segments. However, in the following code:
> {code}
> if (onlyDurableTxns && inProgressOk) {
>   endTxId = Math.min(endTxId, committedTxnId);
> }
> EditLogInputStream elis = EditLogFileInputStream.fromUrl(
> connectionFactory, url, remoteLog.getStartTxId(),
> endTxId, remoteLog.isInProgress());
> {code}
> it is possible for {{remoteLog.getStartTxId()}} to be greater than 
> {{endTxId}}, which will cause the following error:
> {code}
> 2017-11-17 19:55:41,165 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Error replaying edit log at offset 1048576.  Expected transaction ID was 87
> Recent opcode offsets: 1048576
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException:
>  got premature end-of-file at txid 86; expected file to go up to 85
> at 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:197)
> at 
> org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
> at 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:189)
> at 
> org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:205)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:293)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:427)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397)
> at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:481)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:393)
> 2017-11-17 19:55:41,165 WARN 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Error while reading 
> edits from disk. Will try again.
> org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying 
> edit log at offset 1048576.  Expected transaction ID was 87
> Recent opcode offsets: 1048576
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:218)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:293)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:427)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397)
> at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:481)
> at 
> 

[jira] [Commented] (HDFS-12786) Ozone: add port/service names to the ksm/scm web ui

2017-11-22 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262979#comment-16262979
 ] 

Xiaoyu Yao commented on HDFS-12786:
---

+1 for v2 patch. I will commit it shortly.

> Ozone: add port/service names to the ksm/scm web ui
> ---
>
> Key: HDFS-12786
> URL: https://issues.apache.org/jira/browse/HDFS-12786
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Minor
> Attachments: HDFS-12786-HDFS-7240.001.patch, 
> HDFS-12786-HDFS-7240.002.patch
>
>
> Since HDFS-12655 an additional serviceNames field is available for all RPC 
> services via the metrics interface.
> This super small patch modifies the scm/ksm web UI to display this name.
> Instead of
> :9863
> We will display:
> ScmBlockLocationProtocolService (:9863)
> TESTING:
> Start dozone cluster and check the header of the rpc metrics section on the 
> web ui: http://localhost:9876/



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12807) Ozone: Expose RockDB stats via JMX for Ozone metadata stores

2017-11-22 Thread Elek, Marton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDFS-12807:

Status: In Progress  (was: Patch Available)

Back to in-progress to check the unit tests and checkstyle issues.

> Ozone: Expose RockDB stats via JMX for Ozone metadata stores
> 
>
> Key: HDFS-12807
> URL: https://issues.apache.org/jira/browse/HDFS-12807
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Xiaoyu Yao
>Assignee: Elek, Marton
> Attachments: HDFS-12807-HDFS-7240.001.patch, 
> HDFS-12807-HDFS-7240.002.patch
>
>
> RocksDB JNI has an option to expose stats; these can be further exposed to 
> graphs and monitoring applications. We should expose them in our RocksDB 
> metadata store implementation for troubleshooting metadata-related 
> performance issues.
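
A rough sketch of the RocksDB side of this (assuming a recent RocksDB 5.x Java 
API; the MetadataStore integration point and the JMX MBean wiring are not 
shown):

{code:java}
import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.Statistics;
import org.rocksdb.TickerType;

public class RocksStatsSketch {
  public static void main(String[] args) throws Exception {
    RocksDB.loadLibrary();
    Statistics stats = new Statistics();
    Options options = new Options()
        .setCreateIfMissing(true)
        .setStatistics(stats);          // enable JNI-side stats collection
    try (RocksDB db = RocksDB.open(options, "/tmp/ozone-meta-sketch")) {
      db.put("k".getBytes(), "v".getBytes());
      db.get("k".getBytes());
      // These counters are what a JMX MBean could surface.
      System.out.println(stats.getTickerCount(TickerType.BLOCK_CACHE_HIT));
      System.out.println(stats.getTickerCount(TickerType.BYTES_WRITTEN));
    }
  }
}
{code}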



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12754) Lease renewal can hit a deadlock

2017-11-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263202#comment-16263202
 ] 

Hudson commented on HDFS-12754:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13271 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13271/])
HDFS-12754. Lease renewal can hit a deadlock. Contributed by Kuhu (kihwal: rev 
738d1a206aba05f0b4be7d633b17db7fcd1c74bc)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/test/java/org/apache/hadoop/hdfs/client/impl/TestLeaseRenewer.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClientFaultInjector.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/LeaseRenewer.java


> Lease renewal can hit a deadlock 
> -
>
> Key: HDFS-12754
> URL: https://issues.apache.org/jira/browse/HDFS-12754
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.1.0
>
> Attachments: HDFS-12754.001.patch, HDFS-12754.002.patch, 
> HDFS-12754.003.patch, HDFS-12754.004.patch, HDFS-12754.005.patch, 
> HDFS-12754.006.patch, HDFS-12754.007.patch, HDFS-12754.008.patch, 
> HDFS-12754.009.patch
>
>
> The client and the renewer can hit a deadlock during the close operation, 
> since closeFile() reaches back to DFSClient#removeFileBeingWritten. This is 
> possible if the client calls close() while the renewer is renewing a lease.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12754) Lease renewal can hit a deadlock

2017-11-22 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263201#comment-16263201
 ] 

Kihwal Lee commented on HDFS-12754:
---

[~andrew.wang], I think it is good to add this to 3.0.0. If you don't agree, I 
will just commit to branch-3.0 for 3.0.1.

[~kshukla], branch-2 needs a slight change: make {{testLatch}} final.

> Lease renewal can hit a deadlock 
> -
>
> Key: HDFS-12754
> URL: https://issues.apache.org/jira/browse/HDFS-12754
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.1.0
>
> Attachments: HDFS-12754.001.patch, HDFS-12754.002.patch, 
> HDFS-12754.003.patch, HDFS-12754.004.patch, HDFS-12754.005.patch, 
> HDFS-12754.006.patch, HDFS-12754.007.patch, HDFS-12754.008.patch, 
> HDFS-12754.009.patch
>
>
> The client and the renewer can hit a deadlock during the close operation, 
> since closeFile() reaches back to DFSClient#removeFileBeingWritten. This is 
> possible if the client calls close() while the renewer is renewing a lease.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12754) Lease renewal can hit a deadlock

2017-11-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263074#comment-16263074
 ] 

Hadoop QA commented on HDFS-12754:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m  
2s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  6s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 38s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
22s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}120m 53s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}194m 13s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.server.namenode.TestReencryptionWithKMS |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12754 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12898875/HDFS-12754.009.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 95345af20e0b 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 782ba3b |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| unit | 

[jira] [Updated] (HDFS-12847) Regenerate editsStored and editsStored.xml in HDFS tests

2017-11-22 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-12847:
-
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Thanks [~xiaochen]. Committed to trunk, branch-3.0 and 3.0.0.

> Regenerate editsStored and editsStored.xml in HDFS tests
> 
>
> Key: HDFS-12847
> URL: https://issues.apache.org/jira/browse/HDFS-12847
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: test
>Affects Versions: 3.0.0-beta1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Fix For: 3.0.0
>
> Attachments: HDFS-12847.00.patch, editsStored
>
>
> From HDFS-12840, we found that the `editsStored` file in the HDFS tests is 
> missing a few operations, e.g., the following operations from 
> {{DFSTestUtils#runOperations()}}.
> {code}
>  // OP_UPDATE_BLOCKS 25
> final String updateBlockFile = "/update_blocks";
> FSDataOutputStream fout = filesystem.create(new Path(updateBlockFile), 
> true, 4096, (short)1, 4096L);
> fout.write(1);
> fout.hflush();
> long fileId = ((DFSOutputStream)fout.getWrappedStream()).getFileId();
> DFSClient dfsclient = DFSClientAdapter.getDFSClient(filesystem);
> LocatedBlocks blocks = 
> dfsclient.getNamenode().getBlockLocations(updateBlockFile, 0, 
> Integer.MAX_VALUE);
> dfsclient.getNamenode().abandonBlock(blocks.get(0).getBlock(), fileId, 
> updateBlockFile, dfsclient.clientName);
> fout.close();
> {code}
> We should re-generate the edits and the related XML to sync with the code.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12851) Ozone: Upgrade to latest ratis build

2017-11-22 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263103#comment-16263103
 ] 

Tsz Wo Nicholas Sze commented on HDFS-12851:


Thanks, Mukul.  Patch looks good.  Some minor comments:
- Set rpc type in RatisHelper.newRaftClient
{code}
RaftConfigKeys.Rpc.setType(properties, rpcType);
{code}
- In order to avoid hard coding the key names, use RaftServerConfigKeys and 
GrpcConfigKeys in XceiverServerRatis.newRaftProperties as above.
- Rename startNanos to startMillis since Time.monotonicNow() returns 
milliseconds.


> Ozone: Upgrade to latest ratis build
> 
>
> Key: HDFS-12851
> URL: https://issues.apache.org/jira/browse/HDFS-12851
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
> Fix For: HDFS-7240
>
> Attachments: HDFS-12851-HDFS-7240.001.patch, 
> HDFS-12851-HDFS-7240.002.patch
>
>
> The Ratis build on Ozone needs to be upgraded to incorporate memory leak 
> fixes (RATIS-135, RATIS-116), a state machine optimization (RATIS-122), and 
> the async client API (RATIS-113).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12754) Lease renewal can hit a deadlock

2017-11-22 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263121#comment-16263121
 ] 

Kihwal Lee commented on HDFS-12754:
---

+1 looks good. 

> Lease renewal can hit a deadlock 
> -
>
> Key: HDFS-12754
> URL: https://issues.apache.org/jira/browse/HDFS-12754
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: HDFS-12754.001.patch, HDFS-12754.002.patch, 
> HDFS-12754.003.patch, HDFS-12754.004.patch, HDFS-12754.005.patch, 
> HDFS-12754.006.patch, HDFS-12754.007.patch, HDFS-12754.008.patch, 
> HDFS-12754.009.patch
>
>
> The client and the renewer can hit a deadlock during the close operation, 
> since closeFile() reaches back to DFSClient#removeFileBeingWritten. This is 
> possible if the client calls close() while the renewer is renewing a lease.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12847) Regenerate editsStored and editsStored.xml in HDFS tests

2017-11-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263131#comment-16263131
 ] 

Hudson commented on HDFS-12847:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13270 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13270/])
HDFS-12847. Regenerate editsStored and editsStored.xml in HDFS tests. (lei: rev 
785732c13e2ebe9f27350b6be82eb2fb782d7dc4)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/resources/editsStored.xml
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/resources/editsStored


> Regenerate editsStored and editsStored.xml in HDFS tests
> 
>
> Key: HDFS-12847
> URL: https://issues.apache.org/jira/browse/HDFS-12847
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: test
>Affects Versions: 3.0.0-beta1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Fix For: 3.0.0
>
> Attachments: HDFS-12847.00.patch, editsStored
>
>
> From HDFS-12840, we found that the `editsStored` file in the HDFS tests is 
> missing a few operations, e.g., the following operations from 
> {{DFSTestUtils#runOperations()}}.
> {code}
>  // OP_UPDATE_BLOCKS 25
> final String updateBlockFile = "/update_blocks";
> FSDataOutputStream fout = filesystem.create(new Path(updateBlockFile), 
> true, 4096, (short)1, 4096L);
> fout.write(1);
> fout.hflush();
> long fileId = ((DFSOutputStream)fout.getWrappedStream()).getFileId();
> DFSClient dfsclient = DFSClientAdapter.getDFSClient(filesystem);
> LocatedBlocks blocks = 
> dfsclient.getNamenode().getBlockLocations(updateBlockFile, 0, 
> Integer.MAX_VALUE);
> dfsclient.getNamenode().abandonBlock(blocks.get(0).getBlock(), fileId, 
> updateBlockFile, dfsclient.clientName);
> fout.close();
> {code}
> We should re-generate the edits file and the related XML to sync with the code.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12754) Lease renewal can hit a deadlock

2017-11-22 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-12754:
--
Fix Version/s: 3.1.0

> Lease renewal can hit a deadlock 
> -
>
> Key: HDFS-12754
> URL: https://issues.apache.org/jira/browse/HDFS-12754
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.1.0
>
> Attachments: HDFS-12754.001.patch, HDFS-12754.002.patch, 
> HDFS-12754.003.patch, HDFS-12754.004.patch, HDFS-12754.005.patch, 
> HDFS-12754.006.patch, HDFS-12754.007.patch, HDFS-12754.008.patch, 
> HDFS-12754.009.patch
>
>
> The client and the renewer can hit a deadlock during the close operation, since 
> closeFile() reaches back to DFSClient#removeFileBeingWritten. This is 
> possible if the client calls close() while the renewer is renewing a lease.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12588) Use GenericOptionsParser for scm and ksm daemon

2017-11-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263346#comment-16263346
 ] 

Hadoop QA commented on HDFS-12588:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} HDFS-7240 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
46s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 39s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
57s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} HDFS-7240 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
23s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:green}+1{color} | {color:green} shelldocs {color} | {color:green}  0m 
11s{color} | {color:green} There were no new shelldocs issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 43s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}170m 50s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}223m 46s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestFileAppend2 |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160 |
|   | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure170 |
|   | hadoop.ozone.ozShell.TestOzoneShell |
|   | hadoop.fs.TestUnbuffer |
|   | hadoop.ozone.container.common.impl.TestContainerPersistence |
|   | hadoop.hdfs.server.namenode.TestBlockPlacementPolicyRackFaultTolerant |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.ozone.web.client.TestKeys |
|   | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshot |
|   | hadoop.hdfs.server.namenode.TestReencryptionWithKMS |
|   | 

[jira] [Commented] (HDFS-12851) Ozone: Upgrade to latest ratis build

2017-11-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263365#comment-16263365
 ] 

Hadoop QA commented on HDFS-12851:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} HDFS-7240 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
38s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
42s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
10s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
49s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
8s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 22s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
41s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
10s{color} | {color:green} HDFS-7240 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 
45s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 53s{color} | {color:orange} root: The patch generated 2 new + 0 unchanged - 
0 fixed = 2 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  2s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
16s{color} | {color:green} hadoop-project in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
46s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}144m 51s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
38s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}239m 25s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 

[jira] [Updated] (HDFS-12588) Use GenericOptionsParser for scm and ksm daemon

2017-11-22 Thread Elek, Marton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDFS-12588:

Attachment: HDFS-12588-HDFS-7240.005.patch

> Use GenericOptionsParser for scm and ksm daemon
> ---
>
> Key: HDFS-12588
> URL: https://issues.apache.org/jira/browse/HDFS-12588
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7240
>Reporter: Elek, Marton
>Assignee: Elek, Marton
> Attachments: HDFS-12588-HDFS-7240.001.patch, 
> HDFS-12588-HDFS-7240.002.patch, HDFS-12588-HDFS-7240.003.patch, 
> HDFS-12588-HDFS-7240.004.patch, HDFS-12588-HDFS-7240.005.patch
>
>
> Most of the hadoop commands use GenericOptionsParser to support some common 
> CLI arguments (such as -conf, -D, or -libjars, to define the configuration, 
> modify the configuration, or modify the classpath).
> I suggest using the same common options for the scm and ksm daemons as well, as:
> 1. It allows the existing cluster management tools/scripts to be used, as the 
> daemons can be configured in the same way as the namenode and datanode
> 2. It follows the convention from hadoop-common.
> 3. It's easier to develop from the IDE (I start the ksm/scm/datanode/namenode 
> from IntelliJ, but I need to add the configuration to the classpath. With 
> -conf I would be able to use an external configuration.)
> I found one problem during the implementation. Until now we used the `hdfs scm` 
> command both for the daemon and for the scm command line client. If there were 
> no parameters, the daemon was started; with parameters, the CLI was started. 
> The help listed only the daemon.
> The -conf option (GenericOptionsParser) could be used only if we separate the 
> scm and scmcli commands. But anyway, it's cleaner and more visible if we have 
> separate `hdfs scm` and `hdfs scmcli` commands.
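
For reference, the usual hadoop-common wiring looks roughly like this 
({{ScmDaemon}} is a hypothetical class name used only for illustration):

{code}
// Hypothetical daemon routed through ToolRunner, so -conf/-D/-libjars are parsed
// by GenericOptionsParser before the daemon sees its own arguments.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class ScmDaemon extends Configured implements Tool {
  @Override
  public int run(String[] args) throws Exception {
    Configuration conf = getConf();  // already includes -conf/-D overrides
    // ... start the daemon with conf ...
    return 0;
  }

  public static void main(String[] args) throws Exception {
    System.exit(ToolRunner.run(new Configuration(), new ScmDaemon(), args));
  }
}
{code}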



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12665) [AliasMap] Create a version of the AliasMap that runs in memory in the Namenode (leveldb)

2017-11-22 Thread Virajith Jalaparti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-12665:
--
Attachment: HDFS-12665-HDFS-9806.008.patch

Posting a new patch that fixes the findbugs and checkstyle issues and the failing 
test ({{TestProvidedImpl}}) from the earlier jenkins run, and rebases the changes 
on the latest version of the HDFS-9806 branch.

> [AliasMap] Create a version of the AliasMap that runs in memory in the 
> Namenode (leveldb)
> -
>
> Key: HDFS-12665
> URL: https://issues.apache.org/jira/browse/HDFS-12665
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
> Attachments: HDFS-12665-HDFS-9806.001.patch, 
> HDFS-12665-HDFS-9806.002.patch, HDFS-12665-HDFS-9806.003.patch, 
> HDFS-12665-HDFS-9806.004.patch, HDFS-12665-HDFS-9806.005.patch, 
> HDFS-12665-HDFS-9806.006.patch, HDFS-12665-HDFS-9806.007.patch, 
> HDFS-12665-HDFS-9806.008.patch
>
>
> The design of Provided Storage requires the use of an AliasMap to manage the 
> mapping between blocks of files on the local HDFS and ranges of files on a 
> remote storage system. To reduce load from the Namenode, this can be done 
> using a pluggable external service (e.g. AzureTable, Cassandra, Ratis). 
> However, to aid adoption and ease of deployment, we propose an in-memory 
> version.
> This AliasMap will be a wrapper around LevelDB (already a dependency from the 
> Timeline Service) and use protobuf for the key (blockpool, blockid, and 
> genstamp) and the value (url, offset, length, nonce). The in-memory service 
> will also have a configurable port on which it will listen for updates from 
> Storage Policy Satisfier (SPS) Coordinating Datanodes (C-DN).
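
For illustration, a minimal sketch of such a LevelDB wrapper, using the pure-Java 
iq80 implementation that the Timeline Service already depends on. String keys and 
values are used here only for readability; the proposal uses protobuf-encoded 
keys and values:

{code}
import java.io.File;
import org.iq80.leveldb.DB;
import org.iq80.leveldb.Options;
import static org.iq80.leveldb.impl.Iq80DBFactory.bytes;
import static org.iq80.leveldb.impl.Iq80DBFactory.factory;

public class AliasMapSketch {
  public static void main(String[] args) throws Exception {
    Options options = new Options().createIfMissing(true);
    // key: blockpool/blockid/genstamp -> value: url,offset,length,nonce
    try (DB db = factory.open(new File("/tmp/aliasmap"), options)) {
      db.put(bytes("bp-1/blk_1001/gs_1"), bytes("s3a://bucket/file,0,134217728,n1"));
      byte[] location = db.get(bytes("bp-1/blk_1001/gs_1"));
      System.out.println(new String(location, "UTF-8"));
    }
  }
}
{code}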



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12836) startTxId could be greater than endTxId when tailing in-progress edit log

2017-11-22 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-12836:

Status: Patch Available  (was: Open)

> startTxId could be greater than endTxId when tailing in-progress edit log
> -
>
> Key: HDFS-12836
> URL: https://issues.apache.org/jira/browse/HDFS-12836
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HDFS-12836.1.patch, HDFS-12836.2.patch
>
>
> When {{dfs.ha.tail-edits.in-progress}} is true, the edit log tailer will also 
> tail in-progress edit log segments. However, in the following code:
> {code}
> if (onlyDurableTxns && inProgressOk) {
>   endTxId = Math.min(endTxId, committedTxnId);
> }
> EditLogInputStream elis = EditLogFileInputStream.fromUrl(
> connectionFactory, url, remoteLog.getStartTxId(),
> endTxId, remoteLog.isInProgress());
> {code}
> it is possible that {{remoteLog.getStartTxId()}} is greater than 
> {{endTxId}}, which will cause the following error:
> {code}
> 2017-11-17 19:55:41,165 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Error replaying edit log at offset 1048576.  Expected transaction ID was 87
> Recent opcode offsets: 1048576
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException:
>  got premature end-of-file at txid 86; expected file to go up to 85
> at 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:197)
> at 
> org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
> at 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:189)
> at 
> org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:205)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:293)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:427)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397)
> at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:481)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:393)
> 2017-11-17 19:55:41,165 WARN 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Error while reading 
> edits from disk. Will try again.
> org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying 
> edit log at offset 1048576.  Expected transaction ID was 87
> Recent opcode offsets: 1048576
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:218)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:293)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:427)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397)
> at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:481)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:393)
> Caused by: 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException:
>  got premature end-of-file at txid 86; expected file to go up to 85
> at 
> 

[jira] [Updated] (HDFS-12836) startTxId could be greater than endTxId when tailing in-progress edit log

2017-11-22 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-12836:

Attachment: HDFS-12836.2.patch

Thanks [~jojochuang]. Yes, I think this is a good suggestion. Uploaded patch v2.
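
For context, one shape the guard can take, shown as an assumption only (patch v2 
may differ):

{code}
// Illustrative helper: after clamping endTxId to the committed txn id, a segment
// whose startTxId exceeds endTxId has nothing durable to read yet and is skipped.
static boolean hasDurableTxns(long startTxId, long endTxId, long committedTxnId,
    boolean onlyDurableTxns, boolean inProgressOk) {
  if (onlyDurableTxns && inProgressOk) {
    endTxId = Math.min(endTxId, committedTxnId);
  }
  return startTxId <= endTxId;
}
{code}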

> startTxId could be greater than endTxId when tailing in-progress edit log
> -
>
> Key: HDFS-12836
> URL: https://issues.apache.org/jira/browse/HDFS-12836
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.0-alpha1
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HDFS-12836.1.patch, HDFS-12836.2.patch
>
>
> When {{dfs.ha.tail-edits.in-progress}} is true, the edit log tailer will also 
> tail in-progress edit log segments. However, in the following code:
> {code}
> if (onlyDurableTxns && inProgressOk) {
>   endTxId = Math.min(endTxId, committedTxnId);
> }
> EditLogInputStream elis = EditLogFileInputStream.fromUrl(
> connectionFactory, url, remoteLog.getStartTxId(),
> endTxId, remoteLog.isInProgress());
> {code}
> it is possible that {{remoteLog.getStartTxId()}} is greater than 
> {{endTxId}}, which will cause the following error:
> {code}
> 2017-11-17 19:55:41,165 ERROR org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Error replaying edit log at offset 1048576.  Expected transaction ID was 87
> Recent opcode offsets: 1048576
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException:
>  got premature end-of-file at txid 86; expected file to go up to 85
> at 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:197)
> at 
> org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
> at 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:189)
> at 
> org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:205)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:293)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:427)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397)
> at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:481)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:393)
> 2017-11-17 19:55:41,165 WARN 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Error while reading 
> edits from disk. Will try again.
> org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying 
> edit log at offset 1048576.  Expected transaction ID was 87
> Recent opcode offsets: 1048576
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:218)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:293)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:427)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397)
> at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:481)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:393)
> Caused by: 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException:
>  got premature end-of-file at txid 86; expected file to go up to 85
> at 
> 

[jira] [Created] (HDFS-12855) Fsck violates namesystem locking

2017-11-22 Thread Konstantin Shvachko (JIRA)
Konstantin Shvachko created HDFS-12855:
--

 Summary: Fsck violates namesystem locking 
 Key: HDFS-12855
 URL: https://issues.apache.org/jira/browse/HDFS-12855
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.7.4
Reporter: Konstantin Shvachko


{{NamenodeFsck}} accesses {{FSNamesystem}} structures, such as INodes and 
BlockInfo, without holding a lock. See e.g. {{NamenodeFsck.blockIdCK()}}.
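
The expected pattern, for reference: a sketch of the standard FSNamesystem 
read-lock idiom, where {{namesystem}} stands for the FSNamesystem instance:

{code}
// Standard read-lock idiom around namesystem structures; blockIdCK() and similar
// fsck paths would need to wrap their INode/BlockInfo accesses like this.
namesystem.readLock();
try {
  // ... read INodes / BlockInfo here ...
} finally {
  namesystem.readUnlock();
}
{code}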



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9146) HDFS forward seek() within a block shouldn't spawn new TCP Peer/RemoteBlockReader

2017-11-22 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HDFS-9146:
--
Affects Version/s: 3.0.0

> HDFS forward seek() within a block shouldn't spawn new TCP 
> Peer/RemoteBlockReader
> -
>
> Key: HDFS-9146
> URL: https://issues.apache.org/jira/browse/HDFS-9146
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.0, 2.8.0, 2.7.1, 3.0.0
>Reporter: Gopal V
>
> When a seek() + forward readFully() is triggered from a remote dfsclient, 
> HDFS opens a new remote block reader even if the seek is within the same HDFS 
> block.
> (analysis from [~rajesh.balamohan])
> This is due to the fact that a simple read operation assumes that the user is 
> going to read till the end of the block.
> {code}
>   try {
> blockReader = getBlockReader(targetBlock, offsetIntoBlock,
> targetBlock.getBlockSize() - offsetIntoBlock, targetAddr,
> storageType, chosenNode);
> {code}
> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java#L624
> Since the user hasn't read till the end of the block when the next seek 
> happens, the BlockReader assumes this is an aborted read and tries to throw 
> away the TCP peer it has got.
> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java#L324
> {code}
> // If we've now satisfied the whole client read, read one last packet
> // header, which should be empty
> if (bytesNeededToFinish <= 0) {
>   readTrailingEmptyPacket(); 
>  ...
>   sendReadResult(Status.SUCCESS);
> {code}
> Since that is not satisfied, the status code is unset & the peer is not 
> returned to the cache.
> {code}
> if (peerCache != null && sentStatusCode) {
>   peerCache.put(datanodeID, peer);
> } else {
>   peer.close();
> }
> {code}
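
For illustration, the client-side pattern that triggers this (the path and 
offsets are made up; with the behaviour above, a forward positional read within 
the block discards the peer and opens a new reader):

{code}
// Hypothetical access pattern: two positional reads inside the same 128MB block.
// fs is an org.apache.hadoop.fs.FileSystem handle, assumed to be in scope.
FSDataInputStream in = fs.open(new Path("/data/big-file"));
byte[] buf = new byte[4096];
in.readFully(0L, buf);         // reader sized to run to the end of the block
in.readFully(64L << 20, buf);  // forward seek within the block: peer discarded
in.close();
{code}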



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12809) [READ] Fix the randomized selection of locations in {{ProvidedBlocksBuilder}}.

2017-11-22 Thread Virajith Jalaparti (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263324#comment-16263324
 ] 

Virajith Jalaparti commented on HDFS-12809:
---

[~elgoiri], the existing tests ensure that there are enough locations but don't 
really make sure that the distribution is random.
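
For example, a bounded-time alternative to UUID-based retries is to index the 
list of locations directly; a sketch, not the committed patch:

{code}
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

public class RandomChoice {
  // Picks a location uniformly at random in O(1), independent of how the
  // datanode UUIDs are distributed, so selection always terminates.
  static <T> T chooseRandom(List<T> nodes) {
    return nodes.get(ThreadLocalRandom.current().nextInt(nodes.size()));
  }
}
{code}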

> [READ] Fix the randomized selection of locations in {{ProvidedBlocksBuilder}}.
> --
>
> Key: HDFS-12809
> URL: https://issues.apache.org/jira/browse/HDFS-12809
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-12809-HDFS-9806.001.patch
>
>
> Calling {{getBlockLocations}} on files that have a PROVIDED replica results 
> in the datanode locations being selected at random. Currently, this 
> randomization uses the datanode uuids to pick a node at random 
> ({{ProvidedDescriptor#choose}}, {{ProvidedDescriptor#chooseRandom}}). 
> Depending on the distribution of the datanode UUIDs, this can lead to a large 
> number of iterations (which may not terminate) before a location is chosen. 
> This JIRA aims to replace this with a more efficient randomization strategy.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12665) [AliasMap] Create a version of the AliasMap that runs in memory in the Namenode (leveldb)

2017-11-22 Thread Virajith Jalaparti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-12665:
--
Status: Open  (was: Patch Available)

> [AliasMap] Create a version of the AliasMap that runs in memory in the 
> Namenode (leveldb)
> -
>
> Key: HDFS-12665
> URL: https://issues.apache.org/jira/browse/HDFS-12665
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
> Attachments: HDFS-12665-HDFS-9806.001.patch, 
> HDFS-12665-HDFS-9806.002.patch, HDFS-12665-HDFS-9806.003.patch, 
> HDFS-12665-HDFS-9806.004.patch, HDFS-12665-HDFS-9806.005.patch, 
> HDFS-12665-HDFS-9806.006.patch, HDFS-12665-HDFS-9806.007.patch, 
> HDFS-12665-HDFS-9806.008.patch
>
>
> The design of Provided Storage requires the use of an AliasMap to manage the 
> mapping between blocks of files on the local HDFS and ranges of files on a 
> remote storage system. To reduce load from the Namenode, this can be done 
> using a pluggable external service (e.g. AzureTable, Cassandra, Ratis). 
> However, to aid adoption and ease of deployment, we propose an in-memory 
> version.
> This AliasMap will be a wrapper around LevelDB (already a dependency from the 
> Timeline Service) and use protobuf for the key (blockpool, blockid, and 
> genstamp) and the value (url, offset, length, nonce). The in-memory service 
> will also have a configurable port on which it will listen for updates from 
> Storage Policy Satisfier (SPS) Coordinating Datanodes (C-DN).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12665) [AliasMap] Create a version of the AliasMap that runs in memory in the Namenode (leveldb)

2017-11-22 Thread Virajith Jalaparti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-12665:
--
Status: Patch Available  (was: Open)

> [AliasMap] Create a version of the AliasMap that runs in memory in the 
> Namenode (leveldb)
> -
>
> Key: HDFS-12665
> URL: https://issues.apache.org/jira/browse/HDFS-12665
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
> Attachments: HDFS-12665-HDFS-9806.001.patch, 
> HDFS-12665-HDFS-9806.002.patch, HDFS-12665-HDFS-9806.003.patch, 
> HDFS-12665-HDFS-9806.004.patch, HDFS-12665-HDFS-9806.005.patch, 
> HDFS-12665-HDFS-9806.006.patch, HDFS-12665-HDFS-9806.007.patch, 
> HDFS-12665-HDFS-9806.008.patch
>
>
> The design of Provided Storage requires the use of an AliasMap to manage the 
> mapping between blocks of files on the local HDFS and ranges of files on a 
> remote storage system. To reduce load from the Namenode, this can be done 
> using a pluggable external service (e.g. AzureTable, Cassandra, Ratis). 
> However, to aid adoption and ease of deployment, we propose an in-memory 
> version.
> This AliasMap will be a wrapper around LevelDB (already a dependency from the 
> Timeline Service) and use protobuf for the key (blockpool, blockid, and 
> genstamp) and the value (url, offset, length, nonce). The in-memory service 
> will also have a configurable port on which it will listen for updates from 
> Storage Policy Satisfier (SPS) Coordinating Datanodes (C-DN).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12754) Lease renewal can hit a deadlock

2017-11-22 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263261#comment-16263261
 ] 

Andrew Wang commented on HDFS-12754:


Agree, please get it into branch-3.0.0 as well since we're rolling a new RC.

> Lease renewal can hit a deadlock 
> -
>
> Key: HDFS-12754
> URL: https://issues.apache.org/jira/browse/HDFS-12754
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.1.0
>
> Attachments: HDFS-12754.001.patch, HDFS-12754.002.patch, 
> HDFS-12754.003.patch, HDFS-12754.004.patch, HDFS-12754.005.patch, 
> HDFS-12754.006.patch, HDFS-12754.007.patch, HDFS-12754.008.patch, 
> HDFS-12754.009.patch
>
>
> The client and the renewer can hit a deadlock during the close operation, since 
> closeFile() reaches back to DFSClient#removeFileBeingWritten. This is 
> possible if the client calls close() while the renewer is renewing a lease.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-12832) INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to NameNode exit

2017-11-22 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263282#comment-16263282
 ] 

Konstantin Shvachko edited comment on HDFS-12832 at 11/22/17 8:18 PM:
--

The 002 patch fixes the issue of calling {{getFullPathName()}} without holding 
the lock. The fix is to compute {{getFullPathName()}} inside the 
{{BlockReconstructionWork}} constructor, which is called under the lock. Just 
as explained earlier.
I did not write a unit test as we would have to perform a major surgery on 
FSNamesystem in order to get exactly that {{ArrayIndexOutOfBoundsException}}.
Instead I looked through all calls of {{getFullPathName()}}, making sure it is 
done under a lock. Found one place where it is not - in Fsck. Will file a jira 
for that shortly.


was (Author: shv):
The 002 patch fixes the issue of calling {{getFullPathName()}} without holding 
the lock. The fix is to compute {{getFullPathName()}} inside  
{{BlockReconstructionWork}} constructor, which is called under the lock. Just 
as explained earlier.
I did not write a unit as we would have to perform a major surgery on 
FSNamesystem in order to get exacvtly that {{ArrayIndexOutOfBoundsException}}.
Instead I loo0ked through all calls of {{getFullPathName()}}, making sure it is 
done under a lock. Found one place where it is not - in Fsck. Will file a jira 
for that shortly.

> INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to 
> NameNode exit
> 
>
> Key: HDFS-12832
> URL: https://issues.apache.org/jira/browse/HDFS-12832
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.4, 3.0.0-beta1
>Reporter: DENG FEI
>Priority: Critical
>  Labels: release-blocker
> Attachments: HDFS-12832-trunk-001.patch, HDFS-12832.002.patch, 
> exception.log
>
>
> {code:title=INode.java|borderStyle=solid}
> public String getFullPathName() {
> // Get the full path name of this inode.
> if (isRoot()) {
>   return Path.SEPARATOR;
> }
> // compute size of needed bytes for the path
> int idx = 0;
> for (INode inode = this; inode != null; inode = inode.getParent()) {
>   // add component + delimiter (if not tail component)
>   idx += inode.getLocalNameBytes().length + (inode != this ? 1 : 0);
> }
> byte[] path = new byte[idx];
> for (INode inode = this; inode != null; inode = inode.getParent()) {
>   if (inode != this) {
> path[--idx] = Path.SEPARATOR_CHAR;
>   }
>   byte[] name = inode.getLocalNameBytes();
>   idx -= name.length;
>   System.arraycopy(name, 0, path, idx, name.length);
> }
> return DFSUtil.bytes2String(path);
>   }
> {code}
> We found an ArrayIndexOutOfBoundsException at 
> _{color:#707070}System.arraycopy(name, 0, path, idx, name.length){color}_ 
> when the ReplicaMonitor runs, and the NameNode will quit.
> It seems the two loops are not synchronized; the path's length changed between 
> them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12832) INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to NameNode exit

2017-11-22 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-12832:
---
Attachment: HDFS-12832.002.patch

The 002 patch fixes the issue of calling {{getFullPathName()}} without holding 
the lock. The fix is to compute {{getFullPathName()}} inside the 
{{BlockReconstructionWork}} constructor, which is called under the lock. Just 
as explained earlier.
I did not write a unit test, as we would have to perform major surgery on 
FSNamesystem in order to get exactly that {{ArrayIndexOutOfBoundsException}}.
Instead I looked through all calls of {{getFullPathName()}}, making sure each is 
done under a lock. Found one place where it is not - in Fsck. Will file a jira 
for that shortly.
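
The shape of the fix, for illustration (hypothetical class and field names; the 
real change is in {{BlockReconstructionWork}}):

{code}
// Illustrative pattern: resolve the full path eagerly, while the constructor is
// still running under the FSNamesystem lock, so the replication monitor thread
// never walks a parent chain that is mutating underneath it.
class ReconstructionWorkSketch {
  private final String srcPath;            // captured under the lock

  ReconstructionWorkSketch(INode bc) {
    this.srcPath = bc.getFullPathName();   // called while the lock is held
  }

  String getSrcPath() {
    return srcPath;                        // safe to read later, lock-free
  }
}
{code}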

> INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to 
> NameNode exit
> 
>
> Key: HDFS-12832
> URL: https://issues.apache.org/jira/browse/HDFS-12832
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.4, 3.0.0-beta1
>Reporter: DENG FEI
>Priority: Critical
>  Labels: release-blocker
> Attachments: HDFS-12832-trunk-001.patch, HDFS-12832.002.patch, 
> exception.log
>
>
> {code:title=INode.java|borderStyle=solid}
> public String getFullPathName() {
> // Get the full path name of this inode.
> if (isRoot()) {
>   return Path.SEPARATOR;
> }
> // compute size of needed bytes for the path
> int idx = 0;
> for (INode inode = this; inode != null; inode = inode.getParent()) {
>   // add component + delimiter (if not tail component)
>   idx += inode.getLocalNameBytes().length + (inode != this ? 1 : 0);
> }
> byte[] path = new byte[idx];
> for (INode inode = this; inode != null; inode = inode.getParent()) {
>   if (inode != this) {
> path[--idx] = Path.SEPARATOR_CHAR;
>   }
>   byte[] name = inode.getLocalNameBytes();
>   idx -= name.length;
>   System.arraycopy(name, 0, path, idx, name.length);
> }
> return DFSUtil.bytes2String(path);
>   }
> {code}
> We found an ArrayIndexOutOfBoundsException at 
> _{color:#707070}System.arraycopy(name, 0, path, idx, name.length){color}_ 
> when the ReplicaMonitor runs, and the NameNode will quit.
> It seems the two loops are not synchronized; the path's length changed between 
> them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12754) Lease renewal can hit a deadlock

2017-11-22 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-12754:
--
Fix Version/s: 3.0.0

> Lease renewal can hit a deadlock 
> -
>
> Key: HDFS-12754
> URL: https://issues.apache.org/jira/browse/HDFS-12754
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HDFS-12754.001.patch, HDFS-12754.002.patch, 
> HDFS-12754.003.patch, HDFS-12754.004.patch, HDFS-12754.005.patch, 
> HDFS-12754.006.patch, HDFS-12754.007.patch, HDFS-12754.008.patch, 
> HDFS-12754.009.patch
>
>
> The client and the renewer can hit a deadlock during the close operation, since 
> closeFile() reaches back to DFSClient#removeFileBeingWritten. This is 
> possible if the client calls close() while the renewer is renewing a lease.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12832) INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to NameNode exit

2017-11-22 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-12832:
---
Status: Patch Available  (was: Open)

> INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to 
> NameNode exit
> 
>
> Key: HDFS-12832
> URL: https://issues.apache.org/jira/browse/HDFS-12832
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0-beta1, 2.7.4
>Reporter: DENG FEI
>Priority: Critical
>  Labels: release-blocker
> Attachments: HDFS-12832-trunk-001.patch, HDFS-12832.002.patch, 
> exception.log
>
>
> {code:title=INode.java|borderStyle=solid}
> public String getFullPathName() {
> // Get the full path name of this inode.
> if (isRoot()) {
>   return Path.SEPARATOR;
> }
> // compute size of needed bytes for the path
> int idx = 0;
> for (INode inode = this; inode != null; inode = inode.getParent()) {
>   // add component + delimiter (if not tail component)
>   idx += inode.getLocalNameBytes().length + (inode != this ? 1 : 0);
> }
> byte[] path = new byte[idx];
> for (INode inode = this; inode != null; inode = inode.getParent()) {
>   if (inode != this) {
> path[--idx] = Path.SEPARATOR_CHAR;
>   }
>   byte[] name = inode.getLocalNameBytes();
>   idx -= name.length;
>   System.arraycopy(name, 0, path, idx, name.length);
> }
> return DFSUtil.bytes2String(path);
>   }
> {code}
> We found an ArrayIndexOutOfBoundsException at 
> _{color:#707070}System.arraycopy(name, 0, path, idx, name.length){color}_ 
> when the ReplicaMonitor runs, and the NameNode will quit.
> It seems the two loops are not synchronized; the path's length changed between 
> them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12832) INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to NameNode exit

2017-11-22 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263296#comment-16263296
 ] 

Konstantin Shvachko commented on HDFS-12832:


Filed HDFS-12855.

> INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to 
> NameNode exit
> 
>
> Key: HDFS-12832
> URL: https://issues.apache.org/jira/browse/HDFS-12832
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.4, 3.0.0-beta1
>Reporter: DENG FEI
>Priority: Critical
>  Labels: release-blocker
> Attachments: HDFS-12832-trunk-001.patch, HDFS-12832.002.patch, 
> exception.log
>
>
> {code:title=INode.java|borderStyle=solid}
> public String getFullPathName() {
> // Get the full path name of this inode.
> if (isRoot()) {
>   return Path.SEPARATOR;
> }
> // compute size of needed bytes for the path
> int idx = 0;
> for (INode inode = this; inode != null; inode = inode.getParent()) {
>   // add component + delimiter (if not tail component)
>   idx += inode.getLocalNameBytes().length + (inode != this ? 1 : 0);
> }
> byte[] path = new byte[idx];
> for (INode inode = this; inode != null; inode = inode.getParent()) {
>   if (inode != this) {
> path[--idx] = Path.SEPARATOR_CHAR;
>   }
>   byte[] name = inode.getLocalNameBytes();
>   idx -= name.length;
>   System.arraycopy(name, 0, path, idx, name.length);
> }
> return DFSUtil.bytes2String(path);
>   }
> {code}
> We found an ArrayIndexOutOfBoundsException at 
> _{color:#707070}System.arraycopy(name, 0, path, idx, name.length){color}_ 
> when the ReplicaMonitor runs, and the NameNode will quit.
> It seems the two loops are not synchronized; the path's length changed between 
> them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-12832) INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to NameNode exit

2017-11-22 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko reassigned HDFS-12832:
--

Assignee: Konstantin Shvachko

> INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to 
> NameNode exit
> 
>
> Key: HDFS-12832
> URL: https://issues.apache.org/jira/browse/HDFS-12832
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.4, 3.0.0-beta1
>Reporter: DENG FEI
>Assignee: Konstantin Shvachko
>Priority: Critical
>  Labels: release-blocker
> Attachments: HDFS-12832-trunk-001.patch, HDFS-12832.002.patch, 
> exception.log
>
>
> {code:title=INode.java|borderStyle=solid}
> public String getFullPathName() {
> // Get the full path name of this inode.
> if (isRoot()) {
>   return Path.SEPARATOR;
> }
> // compute size of needed bytes for the path
> int idx = 0;
> for (INode inode = this; inode != null; inode = inode.getParent()) {
>   // add component + delimiter (if not tail component)
>   idx += inode.getLocalNameBytes().length + (inode != this ? 1 : 0);
> }
> byte[] path = new byte[idx];
> for (INode inode = this; inode != null; inode = inode.getParent()) {
>   if (inode != this) {
> path[--idx] = Path.SEPARATOR_CHAR;
>   }
>   byte[] name = inode.getLocalNameBytes();
>   idx -= name.length;
>   System.arraycopy(name, 0, path, idx, name.length);
> }
> return DFSUtil.bytes2String(path);
>   }
> {code}
> We found an ArrayIndexOutOfBoundsException at 
> _{color:#707070}System.arraycopy(name, 0, path, idx, name.length){color}_ 
> when the ReplicaMonitor runs, and the NameNode will quit.
> It seems the two loops are not synchronized; the path's length changed between 
> them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12832) INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to NameNode exit

2017-11-22 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-12832:
---
Attachment: HDFS-12832-branch-2.002.patch

The branch-2 patch is different from the trunk/3.0 patch, but it works for 2.8.

> INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to 
> NameNode exit
> 
>
> Key: HDFS-12832
> URL: https://issues.apache.org/jira/browse/HDFS-12832
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.4, 3.0.0-beta1
>Reporter: DENG FEI
>Assignee: Konstantin Shvachko
>Priority: Critical
>  Labels: release-blocker
> Attachments: HDFS-12832-branch-2.002.patch, 
> HDFS-12832-trunk-001.patch, HDFS-12832.002.patch, exception.log
>
>
> {code:title=INode.java|borderStyle=solid}
> public String getFullPathName() {
> // Get the full path name of this inode.
> if (isRoot()) {
>   return Path.SEPARATOR;
> }
> // compute size of needed bytes for the path
> int idx = 0;
> for (INode inode = this; inode != null; inode = inode.getParent()) {
>   // add component + delimiter (if not tail component)
>   idx += inode.getLocalNameBytes().length + (inode != this ? 1 : 0);
> }
> byte[] path = new byte[idx];
> for (INode inode = this; inode != null; inode = inode.getParent()) {
>   if (inode != this) {
> path[--idx] = Path.SEPARATOR_CHAR;
>   }
>   byte[] name = inode.getLocalNameBytes();
>   idx -= name.length;
>   System.arraycopy(name, 0, path, idx, name.length);
> }
> return DFSUtil.bytes2String(path);
>   }
> {code}
> We found an ArrayIndexOutOfBoundsException at 
> _{color:#707070}System.arraycopy(name, 0, path, idx, name.length){color}_ 
> when the ReplicaMonitor runs, and the NameNode will quit.
> It seems the two loops are not synchronized; the path's length changed between 
> them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12832) INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to NameNode exit

2017-11-22 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-12832:
---
Attachment: HDFS-12832-branch-2.7.002.patch

The branch-2.7 patch. It has a few more changes, as I reorganized the members of 
{{ReplicationWork}} to match branch-2.

> INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to 
> NameNode exit
> 
>
> Key: HDFS-12832
> URL: https://issues.apache.org/jira/browse/HDFS-12832
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.4, 3.0.0-beta1
>Reporter: DENG FEI
>Assignee: Konstantin Shvachko
>Priority: Critical
>  Labels: release-blocker
> Attachments: HDFS-12832-branch-2.002.patch, 
> HDFS-12832-branch-2.7.002.patch, HDFS-12832-trunk-001.patch, 
> HDFS-12832.002.patch, exception.log
>
>
> {code:title=INode.java|borderStyle=solid}
> public String getFullPathName() {
> // Get the full path name of this inode.
> if (isRoot()) {
>   return Path.SEPARATOR;
> }
> // compute size of needed bytes for the path
> int idx = 0;
> for (INode inode = this; inode != null; inode = inode.getParent()) {
>   // add component + delimiter (if not tail component)
>   idx += inode.getLocalNameBytes().length + (inode != this ? 1 : 0);
> }
> byte[] path = new byte[idx];
> for (INode inode = this; inode != null; inode = inode.getParent()) {
>   if (inode != this) {
> path[--idx] = Path.SEPARATOR_CHAR;
>   }
>   byte[] name = inode.getLocalNameBytes();
>   idx -= name.length;
>   System.arraycopy(name, 0, path, idx, name.length);
> }
> return DFSUtil.bytes2String(path);
>   }
> {code}
> We found an ArrayIndexOutOfBoundsException at 
> _{color:#707070}System.arraycopy(name, 0, path, idx, name.length){color}_ 
> when the ReplicaMonitor runs, and the NameNode will quit.
> It seems the two loops are not synchronized; the path's length changed between 
> them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org