[jira] [Commented] (HDFS-9063) Correctly handle snapshot path for getContentSummary

2015-09-17 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791678#comment-14791678
 ] 

Yi Liu commented on HDFS-9063:
--

Thanks [~jingzhao] for working on this.  The Jenkins report has an issue; 
{{TestGetContentSummaryWithSnapshot}} passes locally.

I also found a similar issue (not the same, but also a problem with 
getContentSummary when a snapshot exists) while writing tests for large 
directories in HDFS-9053; it also exists in current trunk.  After applying your 
patch, the issue I saw is still there.
I think you could fix that issue too and write a test following the steps below 
to reproduce it (of course, if you don't want to fix it here, I can do it 
separately :)); a rough test sketch follows the steps:
# Suppose we have a directory named 'dir'; create 16 files in the dir
# remove the last file  -- now 15 files in dir
# create a snapshot 's1' of dir
# add 1 file to dir -- now 16 files in dir
# remove the first file in dir -- now 15 files in dir
# call getContentSummary(dir) and then {{getFileCount}}  -- the expected 
result is 15, but the returned value is 16.
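
Here is a minimal sketch of such a test (the {{dfs}} handle, file names, and 
file sizes are illustrative, not from the patch), assuming a running 
{{MiniDFSCluster}} and the standard test utilities:
{code}
// Rough sketch of the reproduction steps above; not the actual test code.
Path dir = new Path("/dir");
dfs.mkdirs(dir);
for (int i = 0; i < 16; i++) {
  DFSTestUtil.createFile(dfs, new Path(dir, "file" + i), 1024L, (short) 1, 0L);
}
dfs.delete(new Path(dir, "file15"), false);   // step 2: 15 files remain
dfs.allowSnapshot(dir);
dfs.createSnapshot(dir, "s1");                // step 3
DFSTestUtil.createFile(dfs, new Path(dir, "file16"), 1024L, (short) 1, 0L); // step 4: 16 files
dfs.delete(new Path(dir, "file0"), false);    // step 5: 15 files remain
ContentSummary summary = dfs.getContentSummary(dir);  // step 6
assertEquals(15, summary.getFileCount());     // currently fails: returns 16
{code}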

> Correctly handle snapshot path for getContentSummary
> 
>
> Key: HDFS-9063
> URL: https://issues.apache.org/jira/browse/HDFS-9063
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-9063.000.patch
>
>
> The current getContentSummary implementation does not take into account the 
> snapshot path, thus if we have the following ops:
> 1. create dirs /foo/bar
> 2. take snapshot s1 on /foo
> 3. create a 1 byte file /foo/bar/baz
> then "du /foo" and "du /foo/.snapshot/s1" can report same results for "bar", 
> which is incorrect since the 1 byte file is not included in snapshot s1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9089) Balancer and Mover should use ".system" as reserved inode name instead of "system"

2015-09-17 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-9089:
--
Hadoop Flags: Incompatible change

This looks like an incompatible change.



> Balancer and Mover should use ".system" as reserved inode name instead of 
> "system"
> --
>
> Key: HDFS-9089
> URL: https://issues.apache.org/jira/browse/HDFS-9089
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Reporter: Archana T
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-9089.01.patch, HDFS-9089.02.patch
>
>
> Currently Balancer and Mover create "/system" for placing mover.id and 
> balancer.id:
> hdfs dfs -ls /
> drwxr-xr-x   - root hadoop  0 2015-09-16 12:49 
> {color:red}/system{color}
> This folder is not deleted once the mover or balancer work is completed, so 
> users cannot create a dir named "system".
> It's better to make ".system" the reserved inode name for balancer and mover 
> instead of "system".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9040) Erasure coding: Refactor DFSStripedOutputStream (Move Namenode RPC Requests to Coordinator)

2015-09-17 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791655#comment-14791655
 ] 

Zhe Zhang commented on HDFS-9040:
-

Thanks Jing for the new patch! The structure looks much cleaner now.

I've been thinking about the design of checking streamer failures at 
{{writeChunk}} and other events at the {{OutputStream}} level. The code 
structure is certainly simpler than handling failures at the streamer level. 
But are there any disadvantages to delaying the handling of a streamer failure? 
If there isn't any downside, should we just do {{updatePipeline}} when 
completing the block?

A few possible disadvantages I can think of:
# In the read-being-written scenario, there will be a longer window of 
*false-fresh* (meaning a stale internal block is considered fresh). 
# When {{NUM_PARITY_BLOCKS}} streamers are dead, the {{OutputStream}} should 
die immediately instead of waiting for the next {{writeChunk}} (see the toy 
sketch after this list). 
# We might want to add logic to replace a failed {{StripedDataStreamer}} in 
the future. Delayed error handling will cause delayed streamer replacement.
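
A toy sketch of the eager check in point 2 (this only illustrates the idea; it 
is not the {{DFSStripedOutputStream}} API, and the {{Streamer}} interface and 
method names are made up):
{code}
import java.io.IOException;
import java.util.List;

class StreamerHealthCheck {
  static final int NUM_PARITY_BLOCKS = 3;       // assuming an RS(6,3) layout

  // Hypothetical minimal view of a streamer: only liveness matters here.
  interface Streamer { boolean isHealthy(); }

  // Eager path: abort the whole stream as soon as too many streamers die,
  // rather than discovering it on the next writeChunk().
  static void checkStreamers(List<Streamer> streamers) throws IOException {
    long dead = streamers.stream().filter(s -> !s.isHealthy()).count();
    if (dead >= NUM_PARITY_BLOCKS) {
      throw new IOException(dead + " streamers dead; failing the stream now");
    }
  }
}
{code}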

> Erasure coding: Refactor DFSStripedOutputStream (Move Namenode RPC Requests 
> to Coordinator)
> ---
>
> Key: HDFS-9040
> URL: https://issues.apache.org/jira/browse/HDFS-9040
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Walter Su
> Attachments: HDFS-9040-HDFS-7285.002.patch, 
> HDFS-9040-HDFS-7285.003.patch, HDFS-9040.00.patch, HDFS-9040.001.wip.patch, 
> HDFS-9040.02.bgstreamer.patch
>
>
> The general idea is to simplify error handling logic.
> Proposal 1:
> A BlockGroupDataStreamer to communicate with the NN to allocate/update blocks, 
> and StripedDataStreamers only have to stream blocks to DNs.
> Proposal 2:
> See below the 
> [comment|https://issues.apache.org/jira/browse/HDFS-9040?focusedCommentId=14741388=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14741388]
>  from [~jingzhao].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9022) Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client

2015-09-17 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791677#comment-14791677
 ] 

Mingliang Liu commented on HDFS-9022:
-

The only failing test, {{TestReplaceDatanodeOnFailure}}, fails occasionally due 
to a known bug; see [HDFS-6101]. The timed-out tests cannot be reproduced on my 
local Mac.


New javac warning in TestMRCredentials.java:
{quote}getUri(InetSocketAddress) in NameNode has been deprecated{quote}
This is expected, as we will file a new jira to replace {{NameNode.getUri()}} 
with {{DFSUtilClient.getNNUri()}}. See the [comments above | 
https://issues.apache.org/jira/browse/HDFS-9022?focusedCommentId=14791104=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14791104]
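
A small sketch of the planned migration (assuming the follow-up helper 
{{DFSUtilClient.getNNUri()}} keeps the same {{InetSocketAddress}} argument as 
the deprecated method; the host name is made up):
{code}
import java.net.InetSocketAddress;
import java.net.URI;
import org.apache.hadoop.hdfs.DFSUtilClient;
import org.apache.hadoop.hdfs.server.namenode.NameNode;

public class GetUriMigration {
  public static void main(String[] args) {
    InetSocketAddress addr = new InetSocketAddress("nn-host.example.com", 8020);
    // Before: deprecated, triggers the javac warning seen in TestMRCredentials.
    URI deprecatedUri = NameNode.getUri(addr);
    // After: the follow-up jira would switch callers to the client-side helper.
    URI clientUri = DFSUtilClient.getNNUri(addr);
    System.out.println(deprecatedUri + " -> " + clientUri);
  }
}
{code}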

> Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client
> --
>
> Key: HDFS-9022
> URL: https://issues.apache.org/jira/browse/HDFS-9022
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client, namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9022.000.patch, HDFS-9022.001.patch, 
> HDFS-9022.002.patch, HDFS-9022.003.patch, HDFS-9022.004.patch
>
>
> The static helper methods in {{NameNode}} are used in the {{hdfs-client}} 
> module. For example, they're used by the {{DFSClient}} and {{NameNodeProxies}} 
> classes, which are being moved to the {{hadoop-hdfs-client}} module. Meanwhile, 
> we should keep the {{NameNode}} class itself in the {{hadoop-hdfs}} module.
> This jira tracks the effort of moving the following static helper methods out 
> of {{NameNode}}, and thus out of the {{hadoop-hdfs}} module. A good place to 
> put these methods is the {{DFSUtilClient}} class:
> {code}
> public static InetSocketAddress getAddress(String address);
> public static InetSocketAddress getAddress(Configuration conf);
> public static InetSocketAddress getAddress(URI filesystemURI);
> public static URI getUri(InetSocketAddress namenode);
> {code}
> Be careful not to introduce new checkstyle warnings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8341) (Summary & Description may be invalid) HDFS mover stuck in loop after failing to move block, doesn't move rest of blocks, can't get data back off decommissioning external

2015-09-17 Thread Surendra Singh Lilhore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-8341:
-
Assignee: (was: Surendra Singh Lilhore)

> (Summary & Description may be invalid) HDFS mover stuck in loop after failing 
> to move block, doesn't move rest of blocks, can't get data back off 
> decommissioning external storage tier as a result
> ---
>
> Key: HDFS-8341
> URL: https://issues.apache.org/jira/browse/HDFS-8341
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.6.0
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>Priority: Minor
>
> HDFS mover gets stuck looping on a block that fails to move and doesn't 
> migrate the rest of the blocks.
> This is preventing recovery of data from a decommissioning external storage 
> tier used for archive (we've had problems with that proprietary "hyperscale" 
> storage product, which is why a couple of blocks here and there have checksum 
> problems or premature EOF as shown below), but this should not prevent moving 
> all the other blocks to recover our data:
> {code}hdfs mover -p /apps/hive/warehouse/
> 15/05/07 14:52:50 INFO mover.Mover: namenodes = 
> {hdfs://nameservice1=[/apps/hive/warehouse/]}
> 15/05/07 14:52:51 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/05/07 14:52:51 INFO block.BlockTokenSecretManager: Setting block keys
> 15/05/07 14:52:51 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/05/07 14:52:52 INFO block.BlockTokenSecretManager: Setting block keys
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 WARN balancer.Dispatcher: Failed to move 
> blk_1075156654_1438349 with size=134217728 from :1019:ARCHIVE to 
> :1019:DISK through :1019: block move is failed: opReplaceBlock 
> BP-120244285--1417023863606:blk_1075156654_1438349 received exception 
> java.io.EOFException: Premature EOF: no length prefix available
> 
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 WARN balancer.Dispatcher: Failed to move 
> blk_1075156654_1438349 with size=134217728 from :1019:ARCHIVE to 
> :1019:DISK through :1019: block move is failed: opReplaceBlock 
> BP-120244285--1417023863606:blk_1075156654_1438349 received exception 
> java.io.EOFException: Premature EOF: no length prefix available
> ..
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9092) Nfs silently drops overlapping write requests, thus data copying can't complete

2015-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791684#comment-14791684
 ] 

Hadoop QA commented on HDFS-9092:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 21s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  1s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 58s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 15s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 22s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m  1s | The patch has 18  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 30s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   0m 53s | The patch appears to introduce 2 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 15s | Pre-build of native portion |
| {color:green}+1{color} | hdfs tests |   1m 46s | Tests passed in 
hadoop-hdfs-nfs. |
| | |  43m 23s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-nfs |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756410/HDFS-9092.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 0832b38 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12502/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12502/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs-nfs.html
 |
| hadoop-hdfs-nfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12502/artifact/patchprocess/testrun_hadoop-hdfs-nfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12502/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12502/console |


This message was automatically generated.

> Nfs silently drops overlapping write requests, thus data copying can't 
> complete
> ---
>
> Key: HDFS-9092
> URL: https://issues.apache.org/jira/browse/HDFS-9092
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.7.1
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-9092.001.patch
>
>
> When NOT using the 'sync' option, NFS writes may issue the following warning:
> org.apache.hadoop.hdfs.nfs.nfs3.OpenFileCtx: Got an overlapping write 
> (1248751616, 1249677312), nextOffset=1248752400. Silently drop it now
> and the size of data copied via NFS will stay at 1248752400.
> What we found is the following:
> 1. The write requests from the client are sent asynchronously. 
> 2. The NFS gateway has a handler that handles each incoming request by 
> creating an internal write request structure and putting it into a cache;
> 3. In parallel, a separate thread in the NFS gateway takes requests out of the 
> cache and writes the data to HDFS.
> The current offset is how much data has been written by the write thread in 
> step 3. The detection of overlapping write requests happens in step 2, but it 
> only checks the write request against the current offset, and trims the 
> request if necessary. Because the write requests are sent asynchronously, if 
> two requests are beyond the current offset and they overlap, the overlap is 
> not detected and both are put into the cache. This causes the symptom reported 
> in this case at step 3.
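> A minimal sketch of the missing check (illustrative only, not the actual 
> {{OpenFileCtx}} code; class and method names are made up): pending writes 
> beyond the current offset would also have to be tested against each other, 
> e.g. with an interval map keyed by start offset:
> {code}
> import java.util.Map;
> import java.util.TreeMap;
> 
> class PendingWrites {
>   // pending requests beyond the current offset: start offset -> length
>   private final TreeMap<Long, Integer> cache = new TreeMap<>();
> 
>   /** True if [offset, offset + count) overlaps any cached request. */
>   boolean overlapsPending(long offset, int count) {
>     Map.Entry<Long, Integer> prev = cache.floorEntry(offset);
>     if (prev != null && prev.getKey() + prev.getValue() > offset) {
>       return true;                 // an earlier request runs into this one
>     }
>     Long next = cache.ceilingKey(offset);
>     return next != null && next < offset + count; // runs into a later request
>   }
> 
>   void add(long offset, int count) { cache.put(offset, count); }
> }
> {code}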



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9089) Balancer and Mover should use ".system" as reserved inode name instead of "system"

2015-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791692#comment-14791692
 ] 

Hadoop QA commented on HDFS-9089:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m 23s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 58s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 22s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 22s | The applied patch generated  1 
new checkstyle issues (total was 59, now 60). |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 30s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m  6s | Post-patch findbugs 
hadoop-hdfs-project/hadoop-hdfs compilation is broken. |
| {color:green}+1{color} | findbugs |   2m  6s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   0m 24s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |   0m 21s | Tests failed in hadoop-hdfs. |
| | |  43m 27s | |
\\
\\
|| Reason || Tests ||
| Failed build | hadoop-hdfs |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756411/HDFS-9089.02.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 0832b38 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12504/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12504/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12504/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12504/console |


This message was automatically generated.

> Balancer and Mover should use ".system" as reserved inode name instead of 
> "system"
> --
>
> Key: HDFS-9089
> URL: https://issues.apache.org/jira/browse/HDFS-9089
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Reporter: Archana T
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-9089.01.patch, HDFS-9089.02.patch
>
>
> Currently Balancer and Mover create "/system" for placing mover.id and 
> balancer.id:
> hdfs dfs -ls /
> drwxr-xr-x   - root hadoop  0 2015-09-16 12:49 
> {color:red}/system{color}
> This folder is not deleted once the mover or balancer work is completed, so 
> users cannot create a dir named "system".
> It's better to make ".system" the reserved inode name for balancer and mover 
> instead of "system".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9040) Erasure coding: Refactor DFSStripedOutputStream (Move Namenode RPC Requests to Coordinator)

2015-09-17 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791739#comment-14791739
 ] 

Walter Su commented on HDFS-9040:
-

bq. should we just do updatePipeline when completing the block? 1. In the 
read-being-written scenario, there will be a longer window of *false-fresh* 
(meaning a stale internal block is considered fresh).
We should do it before hflush/hsync as well.

bq. 2. When NUM_PARITY_BLOCKS streamers are dead, the OutputStream should die 
immediately instead of waiting for the next writeChunk.
A failed streamer is detected in writeChunk. We plan to add periodic checking; 
[~jingzhao] said that before. 

bq. 3. We might want to add logic to replace a failed StripedDataStreamer in 
the future.
No, we won't, I think, if you're talking about something like Datanode 
replacement for a replicated block. You can transfer a healthy replicated RBW 
to a new Datanode, so you still get 3 DNs after replacement. But recovering a 
corrupted RBW internal block is difficult.

I have a question: instead of delaying it, do we even need to refresh 
UC.replicas? 
1. A client reading a UC block being written can decode the replica if it 
misses some part. (With checksum verification, we are only concerned about 
'missing'.)
2. Block recovery / lease recovery truncates all RBWs to the minimal length for 
a replicated block. For striping, assume a corrupted internal block has a small 
length, like 200kb, and the 8 healthy internal blocks have a long length, like 
(1mb-cellSize, 1mb+cellSize). Of course, after recovery we should truncate the 
8 to 1mb (the 8 healthy internal blocks should be at the same last stripe, but 
should we truncate the last stripe? That's not my point.). My point is, we can 
rule out the corrupted internal blocks in {{commitBlockSynchronization}}.
3. Maintaining the indices of UC.replicas. UC.replicas updated by a BlockReport 
is safe, because the reportedBlock has an ID. If UC.replicas is updated by 
updatePipeline, the indices are derived from array offsets (see 
{{UC.setExpectedLocations()}}); that is error-prone. If we don't refresh 
UC.replicas, we are pretty safe.
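
A toy contrast for point 3 (this is only an illustration of the two update 
patterns, not the NameNode code; the class and method names are made up):
{code}
import java.util.HashMap;
import java.util.Map;

class ReplicaTracker {
  private final Map<Long, String> replicaToDn = new HashMap<>();

  // Safe pattern: a block report carries the replica's own ID, so the
  // update cannot be attributed to the wrong replica.
  void onBlockReport(long reportedBlockId, String datanode) {
    replicaToDn.put(reportedBlockId, datanode);
  }

  // Error-prone pattern: identity is derived from the array position, so
  // any reordering by the caller silently corrupts the mapping.
  void onUpdatePipeline(String[] datanodes) {
    for (int i = 0; i < datanodes.length; i++) {
      replicaToDn.put((long) i, datanodes[i]);   // index used as identity
    }
  }
}
{code}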

> Erasure coding: Refactor DFSStripedOutputStream (Move Namenode RPC Requests 
> to Coordinator)
> ---
>
> Key: HDFS-9040
> URL: https://issues.apache.org/jira/browse/HDFS-9040
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Walter Su
> Attachments: HDFS-9040-HDFS-7285.002.patch, 
> HDFS-9040-HDFS-7285.003.patch, HDFS-9040.00.patch, HDFS-9040.001.wip.patch, 
> HDFS-9040.02.bgstreamer.patch
>
>
> The general idea is to simplify error handling logic.
> Proposal 1:
> A BlockGroupDataStreamer to communicate with the NN to allocate/update blocks, 
> and StripedDataStreamers only have to stream blocks to DNs.
> Proposal 2:
> See below the 
> [comment|https://issues.apache.org/jira/browse/HDFS-9040?focusedCommentId=14741388=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14741388]
>  from [~jingzhao].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9089) Balancer and Mover should use ".system" as reserved inode name instead of "system"

2015-09-17 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791760#comment-14791760
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9089:
---

> It's better to make ".system" the reserved inode name for balancer and mover 
> instead of "system".

Why is ".system" better than "system"?  The same argument applies -- what if 
users want to create ".system"?

> Balancer and Mover should use ".system" as reserved inode name instead of 
> "system"
> --
>
> Key: HDFS-9089
> URL: https://issues.apache.org/jira/browse/HDFS-9089
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Reporter: Archana T
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-9089.01.patch, HDFS-9089.02.patch
>
>
> Currently Balancer and Mover create "/system" for placing mover.id and 
> balancer.id:
> hdfs dfs -ls /
> drwxr-xr-x   - root hadoop  0 2015-09-16 12:49 
> {color:red}/system{color}
> This folder is not deleted once the mover or balancer work is completed, so 
> users cannot create a dir named "system".
> It's better to make ".system" the reserved inode name for balancer and mover 
> instead of "system".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-8341) (Summary & Description may be invalid) HDFS mover stuck in loop after failing to move block, doesn't move rest of blocks, can't get data back off decommissioning external

2015-09-17 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze resolved HDFS-8341.
---
Resolution: Invalid

Resolving as invalid.  Please feel free to reopen if you disagree.

> (Summary & Description may be invalid) HDFS mover stuck in loop after failing 
> to move block, doesn't move rest of blocks, can't get data back off 
> decommissioning external storage tier as a result
> ---
>
> Key: HDFS-8341
> URL: https://issues.apache.org/jira/browse/HDFS-8341
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.6.0
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>Priority: Minor
>
> HDFS mover gets stuck looping on a block that fails to move and doesn't 
> migrate the rest of the blocks.
> This is preventing recovery of data from a decommissioning external storage 
> tier used for archive (we've had problems with that proprietary "hyperscale" 
> storage product, which is why a couple of blocks here and there have checksum 
> problems or premature EOF as shown below), but this should not prevent moving 
> all the other blocks to recover our data:
> {code}hdfs mover -p /apps/hive/warehouse/
> 15/05/07 14:52:50 INFO mover.Mover: namenodes = 
> {hdfs://nameservice1=[/apps/hive/warehouse/]}
> 15/05/07 14:52:51 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/05/07 14:52:51 INFO block.BlockTokenSecretManager: Setting block keys
> 15/05/07 14:52:51 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/05/07 14:52:52 INFO block.BlockTokenSecretManager: Setting block keys
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 WARN balancer.Dispatcher: Failed to move 
> blk_1075156654_1438349 with size=134217728 from :1019:ARCHIVE to 
> :1019:DISK through :1019: block move is failed: opReplaceBlock 
> BP-120244285--1417023863606:blk_1075156654_1438349 received exception 
> java.io.EOFException: Premature EOF: no length prefix available
> 
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 WARN balancer.Dispatcher: Failed to move 
> blk_1075156654_1438349 with size=134217728 from :1019:ARCHIVE to 
> :1019:DISK through :1019: block move is failed: opReplaceBlock 
> BP-120244285--1417023863606:blk_1075156654_1438349 received exception 
> java.io.EOFException: Premature EOF: no length prefix available
> ..
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8550) Erasure Coding: Fix FindBugs Multithreaded correctness Warning

2015-09-17 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802814#comment-14802814
 ] 

Rakesh R commented on HDFS-8550:


Hi [~zhz], I've attached a patch to resolve the findbugs warnings; it would be 
great if you could pitch in and review the patch!

> Erasure Coding: Fix FindBugs Multithreaded correctness Warning
> --
>
> Key: HDFS-8550
> URL: https://issues.apache.org/jira/browse/HDFS-8550
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-8550-HDFS-7285-00.patch, 
> HDFS-8550-HDFS-7285-01.patch
>
>
> Please find the findbugs warnings in this 
> [report|https://builds.apache.org/job/PreCommit-HDFS-Build/12444/artifact/patchprocess/patchFindbugsWarningshadoop-hdfs.html]
> 1) {code}
> Bug type IS2_INCONSISTENT_SYNC (click for details) 
> In class org.apache.hadoop.hdfs.DFSStripedInputStream
> Field org.apache.hadoop.hdfs.DFSStripedInputStream.curStripeBuf
> Synchronized 90% of the time
> Unsynchronized access at DFSStripedInputStream.java:[line 829]
> Synchronized access at DFSStripedInputStream.java:[line 183]
> Synchronized access at DFSStripedInputStream.java:[line 186]
> Synchronized access at DFSStripedInputStream.java:[line 184]
> Synchronized access at DFSStripedInputStream.java:[line 382]
> Synchronized access at DFSStripedInputStream.java:[line 460]
> Synchronized access at DFSStripedInputStream.java:[line 461]
> Synchronized access at DFSStripedInputStream.java:[line 461]
> Synchronized access at DFSStripedInputStream.java:[line 285]
> Synchronized access at DFSStripedInputStream.java:[line 297]
> Synchronized access at DFSStripedInputStream.java:[line 298]
> {code}
> 2) 
> {code}
> Unread field: 
> org.apache.hadoop.hdfs.DFSStripedInputStream$BlockReaderInfo.targetBlock
> Bug type URF_UNREAD_FIELD (click for details) 
> In class org.apache.hadoop.hdfs.DFSStripedInputStream$BlockReaderInfo
> Field org.apache.hadoop.hdfs.DFSStripedInputStream$BlockReaderInfo.targetBlock
> At DFSStripedInputStream.java:[line 126]
> {code}
> 3) 
> {code}
> Unchecked/unconfirmed cast from org.apache.hadoop.hdfs.protocol.LocatedBlock 
> to org.apache.hadoop.hdfs.protocol.LocatedStripedBlock in 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.setBlockToken(LocatedBlock,
>  BlockTokenIdentifier$AccessMode)
> Bug type BC_UNCONFIRMED_CAST (click for details) 
> In class org.apache.hadoop.hdfs.server.blockmanagement.BlockManager
> In method 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.setBlockToken(LocatedBlock,
>  BlockTokenIdentifier$AccessMode)
> Actual type org.apache.hadoop.hdfs.protocol.LocatedBlock
> Expected org.apache.hadoop.hdfs.protocol.LocatedStripedBlock
> Value loaded from b
> At BlockManager.java:[line 974]
> {code}
> 4) 
> {code}
> Result of integer multiplication cast to long in 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.divideOneStripe(ErasureCodingPolicy,
>  int, LocatedStripedBlock, long, long, ByteBuffer)
> Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) 
> In class org.apache.hadoop.hdfs.util.StripedBlockUtil
> In method 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.divideOneStripe(ErasureCodingPolicy,
>  int, LocatedStripedBlock, long, long, ByteBuffer)
> At StripedBlockUtil.java:[line 375]
> {code}
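> For warning 4, the usual fix pattern looks like the following (a hedged 
> illustration with made-up values, not the actual StripedBlockUtil change):
> {code}
> int cellSize = 1 << 20;          // e.g. 1 MB cells
> int dataBlkNum = 6;
> int stripesToSkip = 1024;
> // Bug pattern: the multiplication happens in int and overflows before
> // the result is widened to long.
> long wrong = cellSize * dataBlkNum * stripesToSkip;
> // Fix: widen one operand first so the whole product is computed in long.
> long right = (long) cellSize * dataBlkNum * stripesToSkip;
> {code}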



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (HDFS-9053) Support large directories efficiently using B-Tree

2015-09-17 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-9053:
-
Comment: was deleted

(was: \\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m 54s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 7 new or modified test files. |
| {color:red}-1{color} | javac |   7m 59s | The applied patch generated  28  
additional warning messages. |
| {color:green}+1{color} | javadoc |  10m  9s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 30s | The applied patch generated  
16 new checkstyle issues (total was 0, now 16). |
| {color:red}-1{color} | whitespace |   0m  8s | The patch has 7  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 22s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |  22m 23s | Tests failed in 
hadoop-common. |
| {color:red}-1{color} | hdfs tests |  43m 47s | Tests failed in hadoop-hdfs. |
| | | 111m  4s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.net.TestClusterTopology |
|   | hadoop.hdfs.TestFileStatus |
|   | hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes |
|   | hadoop.hdfs.server.namenode.TestINodeFile |
|   | hadoop.fs.contract.hdfs.TestHDFSContractOpen |
|   | hadoop.hdfs.server.datanode.TestFsDatasetCache |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestDatanodeRestart |
|   | hadoop.hdfs.server.namenode.ha.TestHASafeMode |
|   | hadoop.hdfs.TestEncryptionZonesWithHA |
|   | hadoop.fs.contract.hdfs.TestHDFSContractMkdir |
|   | hadoop.hdfs.server.namenode.TestNameNodeXAttr |
|   | hadoop.hdfs.server.namenode.ha.TestBootstrapStandby |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitCache |
|   | hadoop.hdfs.TestDecommission |
|   | hadoop.hdfs.server.namenode.TestFSEditLogLoader |
|   | hadoop.hdfs.server.namenode.ha.TestDelegationTokensWithHA |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.hdfs.server.blockmanagement.TestNameNodePrunesMissingStorages |
|   | hadoop.hdfs.server.datanode.TestCachingStrategy |
|   | hadoop.hdfs.server.namenode.TestFileJournalManager |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.cli.TestXAttrCLI |
|   | hadoop.hdfs.server.namenode.TestDeleteRace |
|   | hadoop.hdfs.server.namenode.TestParallelImageWrite |
|   | hadoop.hdfs.server.namenode.TestNameNodeRespectsBindHostKeys |
|   | hadoop.hdfs.server.namenode.TestNNStorageRetentionFunctional |
|   | hadoop.hdfs.server.namenode.TestSaveNamespace |
|   | hadoop.hdfs.TestDFSRename |
|   | hadoop.hdfs.util.TestDiff |
|   | hadoop.hdfs.server.datanode.TestDataNodeFSDataSetSink |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshotManager |
|   | hadoop.hdfs.server.namenode.TestFsck |
|   | hadoop.hdfs.server.namenode.ha.TestHarFileSystemWithHA |
|   | 
hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaPlacement |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
|   | hadoop.hdfs.server.datanode.TestDeleteBlockPool |
|   | hadoop.hdfs.TestRemoteBlockReader2 |
|   | hadoop.hdfs.server.namenode.TestStorageRestore |
|   | hadoop.hdfs.server.namenode.TestFileLimit |
|   | hadoop.hdfs.server.blockmanagement.TestNodeCount |
|   | hadoop.fs.contract.hdfs.TestHDFSContractSetTimes |
|   | hadoop.hdfs.server.namenode.snapshot.TestCheckpointsWithSnapshots |
|   | hadoop.hdfs.server.namenode.TestFSPermissionChecker |
|   | hadoop.hdfs.server.namenode.TestSecureNameNode |
|   | hadoop.hdfs.server.namenode.TestFileContextAcl |
|   | hadoop.hdfs.server.datanode.TestDataXceiverLazyPersistHint |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.TestDataTransferProtocol |
|   | hadoop.fs.viewfs.TestViewFsWithXAttrs |
|   | hadoop.hdfs.server.blockmanagement.TestDatanodeManager |
|   | hadoop.hdfs.server.datanode.TestDiskError |
|   | hadoop.hdfs.server.namenode.TestMalformedURLs |
|   | hadoop.hdfs.TestReadWhileWriting |
|   | hadoop.fs.TestSWebHdfsFileContextMainOperations |
|   | hadoop.hdfs.TestIsMethodSupported |
|   | hadoop.hdfs.TestParallelShortCircuitReadNoChecksum |
|   | hadoop.hdfs.server.blockmanagement.TestAvailableSpaceBlockPlacementPolicy 
|
|   | hadoop.hdfs.TestFileCreationClient |
|   | 

[jira] [Commented] (HDFS-9053) Support large directories efficiently using B-Tree

2015-09-17 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802729#comment-14802729
 ] 

Yi Liu commented on HDFS-9053:
--

The one failed test is not related to this patch.

> Support large directories efficiently using B-Tree
> --
>
> Key: HDFS-9053
> URL: https://issues.apache.org/jira/browse/HDFS-9053
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Attachments: HDFS-9053 (BTree with simple benchmark).patch, HDFS-9053 
> (BTree).patch, HDFS-9053.001.patch
>
>
> This is a long-standing issue that we have tried to improve in the past.  
> Currently we use an ArrayList for the children under a directory, and the 
> children are kept ordered in the list. Searching for a position costs 
> O(log n), but insertion/deletion causes re-allocations and copies of big 
> arrays, so those operations are costly.  For example, if the children grow to 
> 1M in size, the ArrayList will resize to more than 1M capacity, which needs 
> more than 1M * 4 bytes = 4M of contiguous heap memory; this easily causes 
> full GC in an HDFS cluster where namenode heap memory is already highly 
> used.  I recap the 3 main issues:
> # Insertion/deletion operations in large directories are expensive because of 
> re-allocations and copies of big arrays (see the sketch after this 
> description).
> # Dynamically allocating several MB of contiguous heap memory that will be 
> long-lived can easily cause full GC problems.
> # Even if most children are removed later, the directory INode still occupies 
> the same amount of heap memory, since the ArrayList never shrinks.
> This JIRA is similar to HDFS-7174 created by [~kihwal], but uses a B-Tree to 
> solve the problem, as suggested by [~shv]. 
> So the target of this JIRA is to implement a low-memory-footprint B-Tree and 
> use it to replace the ArrayList. 
> If the number of elements is not large (less than the maximum degree of a 
> B-Tree node), the B-Tree has only one root node, which contains an array for 
> the elements. If the size grows large enough, the node splits automatically, 
> and if elements are removed, B-Tree nodes can merge automatically (see more: 
> https://en.wikipedia.org/wiki/B-tree).  This will solve the above 3 
> issues.
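> A tiny sketch of issue 1 above (an illustration of the ArrayList cost only, 
> not the proposed B-Tree; the count is reduced so it finishes quickly):
> {code}
> import java.util.ArrayList;
> import java.util.Collections;
> import java.util.List;
> 
> public class SortedInsertCost {
>   public static void main(String[] args) {
>     List<String> children = new ArrayList<>();
>     for (int i = 0; i < 100_000; i++) {      // imagine 1M children
>       String name = "file-" + i;
>       int pos = Collections.binarySearch(children, name); // O(log n) lookup
>       if (pos < 0) {
>         // O(n) tail shift on every insert, plus occasional re-allocation
>         // of the whole backing array when capacity is exceeded.
>         children.add(-pos - 1, name);
>       }
>     }
>     // The backing array is one contiguous allocation that never shrinks,
>     // even if most children are removed later.
>   }
> }
> {code}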



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8341) (Summary & Description may be invalid) HDFS mover stuck in loop after failing to move block, doesn't move rest of blocks, can't get data back off decommissioning externa

2015-09-17 Thread Hari Sekhon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802746#comment-14802746
 ] 

Hari Sekhon commented on HDFS-8341:
---

[~szetszwo] I believe this ticket is still valid:

There were holes in the data because that storage tier had a replication factor 
of 1: replication was supposed to be handled within the proprietary hyperscale 
storage solution underpinning that tier, so there was no point in storing 
multiple HDFS replicas there. So if a given block's checksum failed, HDFS Mover 
looped on that block (probably hoping to find other valid block replicas to 
use, but there were no other replicas, so it was stuck looping on the one 
corrupt replica) and never got past that block, so it didn't transfer the rest 
of the data's blocks.

It would be the same problem if all replicas were corrupt, or if a block was 
under-replicated (which happens often) and the existing replica was corrupt.

So this jira is still valid: if HDFS Mover can't find a valid, non-corrupt 
replica, it doesn't proceed to move the rest of the blocks, which prevented 
decommissioning of this storage tier. This is the reason I scripted a custom 
recovery job under the hood of Hadoop: the other blocks were fine, and the 
mover was leaving a lot of data behind on the external storage tier.

> (Summary & Description may be invalid) HDFS mover stuck in loop after failing 
> to move block, doesn't move rest of blocks, can't get data back off 
> decommissioning external storage tier as a result
> ---
>
> Key: HDFS-8341
> URL: https://issues.apache.org/jira/browse/HDFS-8341
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.6.0
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>Priority: Minor
>
> HDFS mover gets stuck looping on a block that fails to move and doesn't 
> migrate the rest of the blocks.
> This is preventing recovery of data from a decommissioning external storage 
> tier used for archive (we've had problems with that proprietary "hyperscale" 
> storage product, which is why a couple of blocks here and there have checksum 
> problems or premature EOF as shown below), but this should not prevent moving 
> all the other blocks to recover our data:
> {code}hdfs mover -p /apps/hive/warehouse/
> 15/05/07 14:52:50 INFO mover.Mover: namenodes = 
> {hdfs://nameservice1=[/apps/hive/warehouse/]}
> 15/05/07 14:52:51 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/05/07 14:52:51 INFO block.BlockTokenSecretManager: Setting block keys
> 15/05/07 14:52:51 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/05/07 14:52:52 INFO block.BlockTokenSecretManager: Setting block keys
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 WARN balancer.Dispatcher: Failed to move 
> blk_1075156654_1438349 with size=134217728 from :1019:ARCHIVE to 
> :1019:DISK through :1019: block move is failed: opReplaceBlock 
> BP-120244285--1417023863606:blk_1075156654_1438349 received exception 
> java.io.EOFException: Premature EOF: no length prefix available
> 
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 WARN balancer.Dispatcher: Failed to move 
> blk_1075156654_1438349 with size=134217728 from :1019:ARCHIVE to 
> :1019:DISK through :1019: block move is failed: opReplaceBlock 
> BP-120244285--1417023863606:blk_1075156654_1438349 received exception 
> java.io.EOFException: Premature EOF: no length prefix available
> ..
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HDFS-8341) (Summary & Description may be invalid) HDFS mover stuck in loop after failing to move block, doesn't move rest of blocks, can't get data back off decommissioning external

2015-09-17 Thread Hari Sekhon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sekhon reopened HDFS-8341:
---

> (Summary & Description may be invalid) HDFS mover stuck in loop after failing 
> to move block, doesn't move rest of blocks, can't get data back off 
> decommissioning external storage tier as a result
> ---
>
> Key: HDFS-8341
> URL: https://issues.apache.org/jira/browse/HDFS-8341
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.6.0
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>Priority: Minor
>
> HDFS mover gets stuck looping on a block that fails to move and doesn't 
> migrate the rest of the blocks.
> This is preventing recovery of data from a decommissioning external storage 
> tier used for archive (we've had problems with that proprietary "hyperscale" 
> storage product, which is why a couple of blocks here and there have checksum 
> problems or premature EOF as shown below), but this should not prevent moving 
> all the other blocks to recover our data:
> {code}hdfs mover -p /apps/hive/warehouse/
> 15/05/07 14:52:50 INFO mover.Mover: namenodes = 
> {hdfs://nameservice1=[/apps/hive/warehouse/]}
> 15/05/07 14:52:51 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/05/07 14:52:51 INFO block.BlockTokenSecretManager: Setting block keys
> 15/05/07 14:52:51 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/05/07 14:52:52 INFO block.BlockTokenSecretManager: Setting block keys
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 WARN balancer.Dispatcher: Failed to move 
> blk_1075156654_1438349 with size=134217728 from :1019:ARCHIVE to 
> :1019:DISK through :1019: block move is failed: opReplaceBlock 
> BP-120244285--1417023863606:blk_1075156654_1438349 received exception 
> java.io.EOFException: Premature EOF: no length prefix available
> 
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 WARN balancer.Dispatcher: Failed to move 
> blk_1075156654_1438349 with size=134217728 from :1019:ARCHIVE to 
> :1019:DISK through :1019: block move is failed: opReplaceBlock 
> BP-120244285--1417023863606:blk_1075156654_1438349 received exception 
> java.io.EOFException: Premature EOF: no length prefix available
> ..
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8341) HDFS mover stuck in loop trying to move corrupt block with no other valid replicas, doesn't move rest of other data blocks

2015-09-17 Thread Hari Sekhon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sekhon updated HDFS-8341:
--
Summary: HDFS mover stuck in loop trying to move corrupt block with no 
other valid replicas, doesn't move rest of other data blocks  (was: HDFS mover 
stuck in loop trying to move corrupt block with no other valid replicas, 
doesn't move rest of other data blocks, can't get data back off decommissioning 
external storage tier as a result)

> HDFS mover stuck in loop trying to move corrupt block with no other valid 
> replicas, doesn't move rest of other data blocks
> --
>
> Key: HDFS-8341
> URL: https://issues.apache.org/jira/browse/HDFS-8341
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.6.0
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>Priority: Minor
>
> HDFS mover gets stuck looping on a block that fails to move and doesn't 
> migrate the rest of the blocks.
> This is preventing recovery of data from a decommissioning external storage 
> tier used for archive (we've had problems with that proprietary "hyperscale" 
> storage product, which is why a couple of blocks here and there have checksum 
> problems or premature EOF as shown below), but this should not prevent moving 
> all the other blocks to recover our data:
> {code}hdfs mover -p /apps/hive/warehouse/
> 15/05/07 14:52:50 INFO mover.Mover: namenodes = 
> {hdfs://nameservice1=[/apps/hive/warehouse/]}
> 15/05/07 14:52:51 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/05/07 14:52:51 INFO block.BlockTokenSecretManager: Setting block keys
> 15/05/07 14:52:51 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/05/07 14:52:52 INFO block.BlockTokenSecretManager: Setting block keys
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 WARN balancer.Dispatcher: Failed to move 
> blk_1075156654_1438349 with size=134217728 from :1019:ARCHIVE to 
> :1019:DISK through :1019: block move is failed: opReplaceBlock 
> BP-120244285--1417023863606:blk_1075156654_1438349 received exception 
> java.io.EOFException: Premature EOF: no length prefix available
> 
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 WARN balancer.Dispatcher: Failed to move 
> blk_1075156654_1438349 with size=134217728 from :1019:ARCHIVE to 
> :1019:DISK through :1019: block move is failed: opReplaceBlock 
> BP-120244285--1417023863606:blk_1075156654_1438349 received exception 
> java.io.EOFException: Premature EOF: no length prefix available
> ..
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8550) Erasure Coding: Fix FindBugs Multithreaded correctness Warning

2015-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802743#comment-14802743
 ] 

Hadoop QA commented on HDFS-8550:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  15m 48s | Findbugs (version ) appears to 
be broken on HDFS-7285. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 45s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 55s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 16s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 31s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 38s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 37s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 33s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 11s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 225m 53s | Tests failed in hadoop-hdfs. |
| | | 268m 12s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.hdfs.server.datanode.TestFsDatasetCache |
|   | hadoop.hdfs.server.namenode.TestFileTruncate |
|   | hadoop.hdfs.TestWriteStripedFileWithFailure |
|   | hadoop.hdfs.TestReplaceDatanodeOnFailure |
|   | hadoop.hdfs.TestRollingUpgrade |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756408/HDFS-8550-HDFS-7285-01.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | HDFS-7285 / ced438a |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12503/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12503/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12503/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12503/console |


This message was automatically generated.

> Erasure Coding: Fix FindBugs Multithreaded correctness Warning
> --
>
> Key: HDFS-8550
> URL: https://issues.apache.org/jira/browse/HDFS-8550
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-8550-HDFS-7285-00.patch, 
> HDFS-8550-HDFS-7285-01.patch
>
>
> Please find the findbugs warnings in this 
> [report|https://builds.apache.org/job/PreCommit-HDFS-Build/12444/artifact/patchprocess/patchFindbugsWarningshadoop-hdfs.html]
> 1) {code}
> Bug type IS2_INCONSISTENT_SYNC (click for details) 
> In class org.apache.hadoop.hdfs.DFSStripedInputStream
> Field org.apache.hadoop.hdfs.DFSStripedInputStream.curStripeBuf
> Synchronized 90% of the time
> Unsynchronized access at DFSStripedInputStream.java:[line 829]
> Synchronized access at DFSStripedInputStream.java:[line 183]
> Synchronized access at DFSStripedInputStream.java:[line 186]
> Synchronized access at DFSStripedInputStream.java:[line 184]
> Synchronized access at DFSStripedInputStream.java:[line 382]
> Synchronized access at DFSStripedInputStream.java:[line 460]
> Synchronized access at DFSStripedInputStream.java:[line 461]
> Synchronized access at DFSStripedInputStream.java:[line 461]
> Synchronized access at DFSStripedInputStream.java:[line 285]
> Synchronized access at DFSStripedInputStream.java:[line 297]
> Synchronized access at DFSStripedInputStream.java:[line 298]
> {code}
> 2) 
> {code}
> Unread field: 
> org.apache.hadoop.hdfs.DFSStripedInputStream$BlockReaderInfo.targetBlock
> Bug type URF_UNREAD_FIELD (click for details) 
> In class org.apache.hadoop.hdfs.DFSStripedInputStream$BlockReaderInfo
> Field org.apache.hadoop.hdfs.DFSStripedInputStream$BlockReaderInfo.targetBlock
> At DFSStripedInputStream.java:[line 126]
> {code}
> 3) 
> {code}
> Unchecked/unconfirmed cast from org.apache.hadoop.hdfs.protocol.LocatedBlock 
> to 

[jira] [Updated] (HDFS-8341) HDFS mover stuck in loop trying to move corrupt block with no other valid replicas, doesn't move rest of other data blocks, can't get data back off decommissioning externa

2015-09-17 Thread Hari Sekhon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sekhon updated HDFS-8341:
--
Summary: HDFS mover stuck in loop trying to move corrupt block with no 
other valid replicas, doesn't move rest of other data blocks, can't get data 
back off decommissioning external storage tier as a result  (was: (Summary & 
Description may be invalid) HDFS mover stuck in loop after failing to move 
block, doesn't move rest of blocks, can't get data back off decommissioning 
external storage tier as a result)

> HDFS mover stuck in loop trying to move corrupt block with no other valid 
> replicas, doesn't move rest of other data blocks, can't get data back off 
> decommissioning external storage tier as a result
> -
>
> Key: HDFS-8341
> URL: https://issues.apache.org/jira/browse/HDFS-8341
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.6.0
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>Priority: Minor
>
> HDFS mover gets stuck looping on a block that fails to move and doesn't 
> migrate the rest of the blocks.
> This is preventing recovery of data from a decommissioning external storage 
> tier used for archive (we've had problems with that proprietary "hyperscale" 
> storage product, which is why a couple of blocks here and there have checksum 
> problems or premature EOF as shown below), but this should not prevent moving 
> all the other blocks so we can recover our data:
> {code}hdfs mover -p /apps/hive/warehouse/
> 15/05/07 14:52:50 INFO mover.Mover: namenodes = 
> {hdfs://nameservice1=[/apps/hive/warehouse/]}
> 15/05/07 14:52:51 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/05/07 14:52:51 INFO block.BlockTokenSecretManager: Setting block keys
> 15/05/07 14:52:51 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 15/05/07 14:52:52 INFO block.BlockTokenSecretManager: Setting block keys
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:52:52 WARN balancer.Dispatcher: Failed to move 
> blk_1075156654_1438349 with size=134217728 from :1019:ARCHIVE to 
> :1019:DISK through :1019: block move is failed: opReplaceBlock 
> BP-120244285--1417023863606:blk_1075156654_1438349 received exception 
> java.io.EOFException: Premature EOF: no length prefix available
> 
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/:1019
> 15/05/07 14:53:31 WARN balancer.Dispatcher: Failed to move 
> blk_1075156654_1438349 with size=134217728 from :1019:ARCHIVE to 
> :1019:DISK through :1019: block move is failed: opReplaceBlock 
> BP-120244285--1417023863606:blk_1075156654_1438349 received exception 
> java.io.EOFException: Premature EOF: no length prefix available
> ..
> {code}
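
One way to read this request: bound the retries per block so a single corrupt block cannot stall the whole migration. A hypothetical sketch of that behavior (invented names, not Mover's actual code):

{code}
import java.util.List;

class MoveLoopSketch {
  static final int MAX_ATTEMPTS_PER_BLOCK = 3;

  interface Block { String id(); }
  interface Dispatcher { boolean tryMove(Block b); } // true on success

  static void moveAll(List<Block> blocks, Dispatcher dispatcher) {
    for (Block b : blocks) {
      int attempts = 0;
      // retry transient failures a few times, then skip the block
      while (!dispatcher.tryMove(b) && ++attempts < MAX_ATTEMPTS_PER_BLOCK) {
      }
      if (attempts >= MAX_ATTEMPTS_PER_BLOCK) {
        System.err.println("Giving up on block " + b.id() + "; moving on");
      }
    }
  }
}
{code}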



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-3107) HDFS truncate

2015-09-17 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791779#comment-14791779
 ] 

Konstantin Shvachko commented on HDFS-3107:
---

I would say absolutely go for it, as supporting NFS and FUSE APIs was one of 
the motivations for truncate.
Let's check with the authors what the state of NFS is with respect to truncate.
[~brandonli], could you please elaborate?

> HDFS truncate
> -
>
> Key: HDFS-3107
> URL: https://issues.apache.org/jira/browse/HDFS-3107
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Reporter: Lei Chang
>Assignee: Plamen Jeliazkov
> Fix For: 2.7.0
>
> Attachments: HDFS-3107-13.patch, HDFS-3107-14.patch, 
> HDFS-3107-15.patch, HDFS-3107-HDFS-7056-combined.patch, HDFS-3107.008.patch, 
> HDFS-3107.15_branch2.patch, HDFS-3107.patch, HDFS-3107.patch, 
> HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
> HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
> HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, 
> HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, 
> HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf, 
> HDFS_truncate_semantics_Mar21.pdf, editsStored, editsStored.xml
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the 
> underlying storage when a transaction is aborted. Currently HDFS does not 
> support truncate (a standard POSIX operation), the reverse operation of 
> append, which forces upper-layer applications to use ugly workarounds (such 
> as keeping track of the discarded byte range per file in a separate metadata 
> store, and periodically running a vacuum process to rewrite compacted files) 
> to overcome this limitation of HDFS.
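
Since the fix version here is 2.7.0, the operation is exposed as FileSystem#truncate. A minimal sketch of the undo pattern described above (hypothetical path and offset):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TruncateUndoSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path log = new Path("/txlog");   // hypothetical file
    long lastGoodOffset = 4096L;     // offset before the aborted transaction
    boolean ready = fs.truncate(log, lastGoodOffset);
    if (!ready) {
      // false: block recovery is still in progress; the caller must wait
      // before reopening the file for append
    }
  }
}
{code}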



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9053) Support large directories efficiently using B-Tree

2015-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791829#comment-14791829
 ] 

Hadoop QA commented on HDFS-9053:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m 44s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 7 new or modified test files. |
| {color:red}-1{color} | javac |   7m 35s | The applied patch generated  28  
additional warning messages. |
| {color:green}+1{color} | javadoc |   9m 50s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 45s | The applied patch generated  
16 new checkstyle issues (total was 0, now 16). |
| {color:red}-1{color} | whitespace |   0m 11s | The patch has 7  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 31s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 15s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |  22m 29s | Tests passed in 
hadoop-common. |
| {color:red}-1{color} | hdfs tests | 160m 56s | Tests failed in hadoop-hdfs. |
| | | 228m 34s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.web.TestWebHDFSOAuth2 |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756270/HDFS-9053.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 0832b38 |
| javac | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12501/artifact/patchprocess/diffJavacWarnings.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12501/artifact/patchprocess/diffcheckstylehadoop-common.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12501/artifact/patchprocess/whitespace.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12501/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12501/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12501/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12501/console |


This message was automatically generated.

> Support large directories efficiently using B-Tree
> --
>
> Key: HDFS-9053
> URL: https://issues.apache.org/jira/browse/HDFS-9053
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Attachments: HDFS-9053 (BTree with simple benchmark).patch, HDFS-9053 
> (BTree).patch, HDFS-9053.001.patch
>
>
> This is a long-standing issue; we have tried to improve it in the past.  
> Currently we use an ArrayList for the children under a directory, and the 
> children are kept ordered in the list. Locating a child for 
> insert/delete/search is O(log n), but insertion/deletion causes 
> re-allocations and copies of big arrays, so those operations are costly.  
> For example, if the children grow to 1M entries, the ArrayList will resize 
> to > 1M capacity and thus needs > 1M * 4 bytes = 4 MB of contiguous heap 
> memory, which easily causes full GC in an HDFS cluster where namenode heap 
> memory is already highly used.  To recap the 3 main issues:
> # Insertion/deletion operations in large directories are expensive because 
> of re-allocations and copies of big arrays.
> # Dynamically allocating several MB of contiguous, long-lived heap memory 
> can easily cause full GC problems.
> # Even if most children are removed later, the directory INode still 
> occupies the same amount of heap memory, since the ArrayList never shrinks.
> This JIRA is similar to HDFS-7174 created by [~kihwal], but uses a B-Tree 
> to solve the problem, as suggested by [~shv]. 
> So the target of this JIRA is to implement a low-memory-footprint B-Tree 
> and use it to replace the ArrayList. 
> If the number of elements is not large (less than the maximum degree of a 
> B-Tree node), the B-Tree only has one root node which contains an array for 
> the elements. And if the 
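
For illustration, a self-contained sketch (not HDFS code; names invented) of the cost pattern described above: a sorted ArrayList gives O(log n) lookup via binary search, but every insert shifts the tail of one big contiguous array, and growth reallocates the whole array.

{code}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SortedChildrenSketch {
  public static void main(String[] args) {
    List<String> children = new ArrayList<>();
    for (int i = 0; i < 1_000_000; i++) {
      String name = "file-" + i;
      int idx = Collections.binarySearch(children, name); // O(log n) lookup
      if (idx < 0) {
        // O(n): shifts every later element; resizes copy the whole array,
        // which for 1M entries means one multi-MB contiguous allocation
        children.add(-idx - 1, name);
      }
    }
    // A B-Tree of small nodes keeps O(log n) search while splitting storage
    // into many small arrays, avoiding a single long-lived multi-MB array.
  }
}
{code}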

[jira] [Updated] (HDFS-8632) Erasure Coding: Add InterfaceAudience annotation to the erasure coding classes

2015-09-17 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-8632:
---
Status: Patch Available  (was: Open)

> Erasure Coding: Add InterfaceAudience annotation to the erasure coding classes
> --
>
> Key: HDFS-8632
> URL: https://issues.apache.org/jira/browse/HDFS-8632
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-8632-HDFS-7285-00.patch, 
> HDFS-8632-HDFS-7285-01.patch, HDFS-8632-HDFS-7285-02.patch
>
>
> I've noticed some of the erasure coding classes missing 
> {{@InterfaceAudience}} annotation. It would be good to identify the classes 
> and add proper annotation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8873) throttle directoryScanner

2015-09-17 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated HDFS-8873:
---
Attachment: HDFS-8873.003.patch

Reposting patch.

> throttle directoryScanner
> -
>
> Key: HDFS-8873
> URL: https://issues.apache.org/jira/browse/HDFS-8873
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Nathan Roberts
>Assignee: Daniel Templeton
> Attachments: HDFS-8873.001.patch, HDFS-8873.002.patch, 
> HDFS-8873.003.patch
>
>
> The new 2-level directory layout can make directory scans expensive in terms 
> of disk seeks (see HDFS-8791 for details). 
> It would be good if the directoryScanner() had a configurable duty cycle that 
> would reduce its impact on disk performance (much like the approach in 
> HDFS-8617). 
> Without such a throttle, disks can go 100% busy for many minutes at a time 
> (assuming the common case of all inodes in cache but no directory blocks 
> cached, 64K seeks are required for a full directory listing, which translates 
> to 655 seconds).
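
For illustration, a minimal duty-cycle throttle sketch (hypothetical, not the HDFS-8873 patch): the scanner calls maybePause() between batches of work and, once it has run for budgetMs within the current period, sleeps out the remainder of the period.

{code}
public final class DutyCycleThrottle {
  private final long periodMs;   // e.g. 1000
  private final long budgetMs;   // e.g. 250 for a 25% duty cycle
  private long windowStart = System.currentTimeMillis();

  public DutyCycleThrottle(long periodMs, double dutyCycle) {
    this.periodMs = periodMs;
    this.budgetMs = (long) (periodMs * dutyCycle);
  }

  /** Call between units of scan work; sleeps once the period's budget is spent. */
  public void maybePause() throws InterruptedException {
    long elapsed = System.currentTimeMillis() - windowStart;
    if (elapsed >= budgetMs) {
      long sleep = periodMs - elapsed;          // idle for the rest of the period
      if (sleep > 0) {
        Thread.sleep(sleep);
      }
      windowStart = System.currentTimeMillis(); // start a new period
    }
  }
}
{code}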



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8873) throttle directoryScanner

2015-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803149#comment-14803149
 ] 

Hadoop QA commented on HDFS-8873:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  16m 13s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 53s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 14s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 26s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 34s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m  7s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 41s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m 34s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 14s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 130m  1s | Tests failed in hadoop-hdfs. |
| | | 173m 34s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.TestParallelShortCircuitRead |
|   | hadoop.hdfs.server.namenode.TestAllowFormat |
|   | hadoop.hdfs.server.namenode.TestCheckPointForSecurityTokens |
|   | hadoop.hdfs.TestBlockStoragePolicy |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshotMetrics |
|   | hadoop.hdfs.TestFileLengthOnClusterRestart |
|   | hadoop.hdfs.TestAppendSnapshotTruncate |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshottableDirListing |
|   | hadoop.hdfs.server.namenode.snapshot.TestUpdatePipelineWithSnapshots |
|   | hadoop.hdfs.server.namenode.TestDiskspaceQuotaUpdate |
|   | hadoop.hdfs.server.namenode.TestCheckpoint |
|   | hadoop.hdfs.TestDFSUpgradeFromImage |
|   | hadoop.hdfs.TestReplaceDatanodeOnFailure |
|   | hadoop.hdfs.TestRemoteBlockReader2 |
|   | hadoop.hdfs.server.namenode.TestStartup |
|   | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
|   | hadoop.hdfs.TestDFSStorageStateRecovery |
|   | hadoop.hdfs.server.namenode.TestFSImageWithXAttr |
|   | hadoop.hdfs.TestRemoteBlockReader |
|   | hadoop.hdfs.TestMultiThreadedHflush |
|   | hadoop.hdfs.TestBlockReaderLocal |
|   | hadoop.hdfs.server.mover.TestMover |
|   | hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks |
|   | hadoop.hdfs.server.namenode.web.resources.TestWebHdfsDataLocality |
|   | hadoop.hdfs.server.namenode.TestFSImageWithAcl |
|   | hadoop.hdfs.server.namenode.TestLargeDirectoryDelete |
|   | hadoop.hdfs.TestPread |
|   | hadoop.hdfs.server.namenode.TestFSEditLogLoader |
|   | hadoop.hdfs.server.namenode.ha.TestQuotasWithHA |
|   | hadoop.hdfs.crypto.TestHdfsCryptoStreams |
|   | hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerForAcl |
|   | hadoop.hdfs.TestDFSAddressConfig |
|   | hadoop.hdfs.server.namenode.TestFSDirectory |
|   | hadoop.hdfs.server.namenode.snapshot.TestNestedSnapshots |
|   | hadoop.hdfs.TestParallelRead |
|   | hadoop.hdfs.TestRestartDFS |
|   | hadoop.hdfs.TestParallelShortCircuitReadNoChecksum |
|   | hadoop.hdfs.TestParallelShortCircuitLegacyRead |
|   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS |
|   | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
|   | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
|   | hadoop.hdfs.server.namenode.TestDeadDatanode |
|   | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.server.namenode.ha.TestFailoverWithBlockTokensEnabled |
|   | hadoop.hdfs.TestFetchImage |
|   | hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication |
|   | hadoop.hdfs.server.namenode.ha.TestDNFencing |
|   | hadoop.hdfs.TestDFSUpgrade |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshotBlocksMap |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.TestMissingBlocksAlert |
|   | hadoop.hdfs.server.namenode.ha.TestHAMetrics |
|   | hadoop.hdfs.TestQuota |
|   | hadoop.hdfs.server.namenode.snapshot.TestFileContextSnapshot |
|   | hadoop.hdfs.server.namenode.TestQuotaByStorageType |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshotDeletion |
|   | hadoop.hdfs.server.namenode.TestStorageRestore |
|   | hadoop.hdfs.server.namenode.TestSaveNamespace |
|   | hadoop.hdfs.server.namenode.TestParallelImageWrite |
|   | hadoop.hdfs.tools.TestDebugAdmin |
|   | hadoop.hdfs.TestPersistBlocks |

[jira] [Commented] (HDFS-9095) RPC client should fail gracefully when the connection is timed out or reset

2015-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803187#comment-14803187
 ] 

Hadoop QA commented on HDFS-9095:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756394/HDFS-9095.000.patch |
| Optional Tests | javadoc javac unit |
| git revision | trunk / 58d1a02 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12508/console |


This message was automatically generated.

> RPC client should fail gracefully when the connection is timed out or reset
> ---
>
> Key: HDFS-9095
> URL: https://issues.apache.org/jira/browse/HDFS-9095
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-9095.000.patch
>
>
> The RPC client should fail gracefully when the connection is timed out or 
> reset, instead of bailing out. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8632) Erasure Coding: Add InterfaceAudience annotation to the erasure coding classes

2015-09-17 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-8632:
---
Attachment: HDFS-8632-HDFS-7285-03.patch

> Erasure Coding: Add InterfaceAudience annotation to the erasure coding classes
> --
>
> Key: HDFS-8632
> URL: https://issues.apache.org/jira/browse/HDFS-8632
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-8632-HDFS-7285-00.patch, 
> HDFS-8632-HDFS-7285-01.patch, HDFS-8632-HDFS-7285-02.patch, 
> HDFS-8632-HDFS-7285-03.patch
>
>
> I've noticed some of the erasure coding classes missing 
> {{@InterfaceAudience}} annotation. It would be good to identify the classes 
> and add proper annotation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9022) Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client

2015-09-17 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803170#comment-14803170
 ] 

Haohui Mai commented on HDFS-9022:
--

The patch looks good to me.

bq. This is expected as we will file a new jira to replace the 
NameNode.getUri() with DFSUtilClient.getNNUri(). See comments above

Can you please file the jira and link it to this jira?

> Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client
> --
>
> Key: HDFS-9022
> URL: https://issues.apache.org/jira/browse/HDFS-9022
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client, namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9022.000.patch, HDFS-9022.001.patch, 
> HDFS-9022.002.patch, HDFS-9022.003.patch, HDFS-9022.004.patch
>
>
> The static helper methods in {{NameNode}} are used in the {{hdfs-client}} 
> module. For example, they are used by the {{DFSClient}} and 
> {{NameNodeProxies}} classes, which are being moved to the 
> {{hadoop-hdfs-client}} module. Meanwhile, we should keep the {{NameNode}} 
> class itself in the {{hadoop-hdfs}} module.
> This jira tracks the effort of moving the following static helper methods 
> out of {{NameNode}}, and thus out of the {{hadoop-hdfs}} module. A good 
> place to put these methods is the {{DFSUtilClient}} class:
> {code}
> public static InetSocketAddress getAddress(String address);
> public static InetSocketAddress getAddress(Configuration conf);
> public static InetSocketAddress getAddress(URI filesystemURI);
> public static URI getUri(InetSocketAddress namenode);
> {code}
> Be cautious not to introduce new checkstyle warnings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9093) Initialize protobuf fields in RemoteBlockReaderTest

2015-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803192#comment-14803192
 ] 

Hadoop QA commented on HDFS-9093:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756393/HDFS-9093.000.patch |
| Optional Tests | javac unit |
| git revision | trunk / 58d1a02 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12510/console |


This message was automatically generated.

> Initialize protobuf fields in RemoteBlockReaderTest
> ---
>
> Key: HDFS-9093
> URL: https://issues.apache.org/jira/browse/HDFS-9093
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-9093.000.patch
>
>
> Protobuf 2.6.1 complains that the {{ExtendedBlockProto}} objects in 
> {{remote_block_reader_test.cc}} are not initialized.
> The test should be fixed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9095) RPC client should fail gracefully when the connection is timed out or reset

2015-09-17 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-9095:
-
Status: Patch Available  (was: Open)

> RPC client should fail gracefully when the connection is timed out or reset
> ---
>
> Key: HDFS-9095
> URL: https://issues.apache.org/jira/browse/HDFS-9095
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-9095.000.patch
>
>
> The RPC client should fail gracefully when the connection is timed out or 
> reset, instead of bailing out. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9093) Initialize protobuf fields in RemoteBlockReaderTest

2015-09-17 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-9093:
-
Status: Patch Available  (was: Open)

> Initialize protobuf fields in RemoteBlockReaderTest
> ---
>
> Key: HDFS-9093
> URL: https://issues.apache.org/jira/browse/HDFS-9093
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-9093.000.patch
>
>
> Protobuf 2.6.1 complains that the {{ExtendedBlockProto}} objects in 
> {{remote_block_reader_test.cc}} are not initialized.
> The test should be fixed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8632) Erasure Coding: Add InterfaceAudience annotation to the erasure coding classes

2015-09-17 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803173#comment-14803173
 ] 

Rakesh R commented on HDFS-8632:


It seems there are a few [findbugs warnings on the LocatedStripedBlock 
class|https://builds.apache.org/job/PreCommit-HDFS-Build/12505/artifact/patchprocess/patchFindbugsWarningshadoop-hdfs-client.html#Warnings_MALICIOUS_CODE].
 I attached another patch fixing them.

> Erasure Coding: Add InterfaceAudience annotation to the erasure coding classes
> --
>
> Key: HDFS-8632
> URL: https://issues.apache.org/jira/browse/HDFS-8632
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-8632-HDFS-7285-00.patch, 
> HDFS-8632-HDFS-7285-01.patch, HDFS-8632-HDFS-7285-02.patch, 
> HDFS-8632-HDFS-7285-03.patch
>
>
> I've noticed some of the erasure coding classes missing 
> {{@InterfaceAudience}} annotation. It would be good to identify the classes 
> and add proper annotation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8873) throttle directoryScanner

2015-09-17 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802910#comment-14802910
 ] 

Kihwal Lee commented on HDFS-8873:
--

Shall we target 2.7.2? 

> throttle directoryScanner
> -
>
> Key: HDFS-8873
> URL: https://issues.apache.org/jira/browse/HDFS-8873
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Nathan Roberts
>Assignee: Daniel Templeton
> Attachments: HDFS-8873.001.patch, HDFS-8873.002.patch, 
> HDFS-8873.003.patch
>
>
> The new 2-level directory layout can make directory scans expensive in terms 
> of disk seeks (see HDFS-8791 for details). 
> It would be good if the directoryScanner() had a configurable duty cycle that 
> would reduce its impact on disk performance (much like the approach in 
> HDFS-8617). 
> Without such a throttle, disks can go 100% busy for many minutes at a time 
> (assuming the common case of all inodes in cache but no directory blocks 
> cached, 64K seeks are required for a full directory listing, which translates 
> to 655 seconds).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8632) Erasure Coding: Add InterfaceAudience annotation to the erasure coding classes

2015-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802949#comment-14802949
 ] 

Hadoop QA commented on HDFS-8632:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  18m  2s | Findbugs (version ) appears to 
be broken on HDFS-7285. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 43s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 15s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 15s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m 10s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m 10s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 42s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   6m 51s | The patch appears to introduce 7 
new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |  23m 40s | Tests failed in 
hadoop-common. |
| {color:red}-1{color} | hdfs tests |   0m 22s | Tests failed in hadoop-hdfs. |
| {color:red}-1{color} | hdfs tests |   0m 19s | Tests failed in 
hadoop-hdfs-client. |
| | |  72m  8s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| FindBugs | module:hadoop-hdfs-client |
| Failed unit tests | hadoop.fs.contract.localfs.TestLocalFSContractMkdir |
| Failed build | hadoop-hdfs |
|   | hadoop-hdfs-client |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756142/HDFS-8632-HDFS-7285-02.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | HDFS-7285 / ced438a |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12505/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12505/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12505/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs-client.html
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12505/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12505/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12505/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12505/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12505/console |


This message was automatically generated.

> Erasure Coding: Add InterfaceAudience annotation to the erasure coding classes
> --
>
> Key: HDFS-8632
> URL: https://issues.apache.org/jira/browse/HDFS-8632
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-8632-HDFS-7285-00.patch, 
> HDFS-8632-HDFS-7285-01.patch, HDFS-8632-HDFS-7285-02.patch
>
>
> I've noticed some of the erasure coding classes missing 
> {{@InterfaceAudience}} annotation. It would be good to identify the classes 
> and add proper annotation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8873) throttle directoryScanner

2015-09-17 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated HDFS-8873:
---
Attachment: (was: HDFS-8873.003.patch)

> throttle directoryScanner
> -
>
> Key: HDFS-8873
> URL: https://issues.apache.org/jira/browse/HDFS-8873
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Nathan Roberts
>Assignee: Daniel Templeton
> Attachments: HDFS-8873.001.patch, HDFS-8873.002.patch
>
>
> The new 2-level directory layout can make directory scans expensive in terms 
> of disk seeks (see HDFS-8791 for details). 
> It would be good if the directoryScanner() had a configurable duty cycle that 
> would reduce its impact on disk performance (much like the approach in 
> HDFS-8617). 
> Without such a throttle, disks can go 100% busy for many minutes at a time 
> (assuming the common case of all inodes in cache but no directory blocks 
> cached, 64K seeks are required for a full directory listing, which translates 
> to 655 seconds).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9097) Erasure coding: update EC command "-s" flag to "-p" when specifying policy

2015-09-17 Thread Zhe Zhang (JIRA)
Zhe Zhang created HDFS-9097:
---

 Summary: Erasure coding: update EC command "-s" flag to "-p" when 
specifying policy
 Key: HDFS-9097
 URL: https://issues.apache.org/jira/browse/HDFS-9097
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Zhe Zhang
Assignee: Zhe Zhang


HDFS-8833 missed this update.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8647) Abstract BlockManager's rack policy into BlockPlacementPolicy

2015-09-17 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803246#comment-14803246
 ] 

Brahma Reddy Battula commented on HDFS-8647:


{{TestBlocksWithNotEnoughRacks}} is failing due to the following cases.

 *Before the change* 

If the cluster was multi-rack first and later became single-rack, those blocks 
were added to {{NeededReplications}}, and the tests expected that value. 
   a. But this held only while the namenode stayed alive. If the NN was 
restarted after the cluster became single-rack, {{NeededReplications}} would 
no longer contain the block.
   b. If another rack was added before an NN restart, replication to the new 
rack happened automatically. But once the NN was restarted and a new rack was 
added, replication to the new rack (when the single rack already has enough 
replicas == RF) happened only if the RF of those blocks changed.

 *After the change*

Blocks are not added to {{NeededReplications}} immediately when the cluster 
becomes single-rack.
  a. Automatic replication (when the single rack already has enough replicas 
== RF) will not happen when one more rack is added to the cluster; it is 
triggered only if the RF of those blocks changes.

If the new behavior is okay, the test case can be updated; otherwise I can 
update the patch.

> Abstract BlockManager's rack policy into BlockPlacementPolicy
> -
>
> Key: HDFS-8647
> URL: https://issues.apache.org/jira/browse/HDFS-8647
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-8647-001.patch, HDFS-8647-002.patch, 
> HDFS-8647-003.patch
>
>
> Sometimes we want to have the namenode use an alternative block placement 
> policy, such as the upgrade domains in HDFS-7541.
> BlockManager has built-in assumptions about rack policy in functions such as 
> useDelHint and blockHasEnoughRacks. That means when we have a new block 
> placement policy, we need to modify BlockManager to account for it. Ideally 
> BlockManager should ask the BlockPlacementPolicy object instead. That would 
> allow us to provide a new BlockPlacementPolicy without changing BlockManager.
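
For illustration, a hypothetical sketch (invented names) of the delegation this issue proposes: BlockManager asks the policy whether a replica set satisfies its fault-domain rule instead of hard-coding rack logic.

{code}
import java.util.List;

interface PlacementPolicySketch {
  /** True if the replicas span enough fault domains (racks, upgrade domains,
   *  ...) to satisfy this policy for the given replication factor. */
  boolean hasEnoughFaultDomains(List<String> replicaLocations, int replication);
}
{code}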



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9097) Erasure coding: update EC command "-s" flag to "-p" when specifying policy

2015-09-17 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-9097:

Status: Patch Available  (was: Open)

> Erasure coding: update EC command "-s" flag to "-p" when specifying policy
> --
>
> Key: HDFS-9097
> URL: https://issues.apache.org/jira/browse/HDFS-9097
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>
> HDFS-8833 missed this update.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9097) Erasure coding: update EC command "-s" flag to "-p" when specifying policy

2015-09-17 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-9097:

Attachment: HDFS-9097-HDFS-7285.00.patch

> Erasure coding: update EC command "-s" flag to "-p" when specifying policy
> --
>
> Key: HDFS-9097
> URL: https://issues.apache.org/jira/browse/HDFS-9097
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-9097-HDFS-7285.00.patch
>
>
> HDFS-8833 missed this update.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9063) Correctly handle snapshot path for getContentSummary

2015-09-17 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803241#comment-14803241
 ] 

Jing Zhao commented on HDFS-9063:
-

Thanks Yi! {{getContentSummary(dir's current path)}} should include (all the 
current files/directories) + (all the files/directories that were deleted but 
still exist in snapshots). Thus in the above case, the return value 16 in step 
6 is correct: we have 15 files in the current dir, plus the original first 
file, which still exists in dir/.snapshot/s1.
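
For illustration, a sketch of Yi's six steps under these semantics (hypothetical paths; assumes a running cluster reachable through a {{DistributedFileSystem}} handle):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class SnapshotSummarySketch {
  public static void main(String[] args) throws Exception {
    DistributedFileSystem fs =
        (DistributedFileSystem) FileSystem.get(new Configuration());
    Path dir = new Path("/dir");
    fs.mkdirs(dir);
    for (int i = 0; i < 16; i++) {                  // 1. create 16 files
      fs.create(new Path(dir, "f" + i)).close();
    }
    fs.delete(new Path(dir, "f15"), false);         // 2. remove the last -> 15
    fs.allowSnapshot(dir);
    fs.createSnapshot(dir, "s1");                   // 3. snapshot s1 (15 files)
    fs.create(new Path(dir, "f16")).close();        // 4. add one -> 16
    fs.delete(new Path(dir, "f0"), false);          // 5. remove the first -> 15
    long count = fs.getContentSummary(dir).getFileCount();  // 6.
    // 16: the 15 current files plus f0, which is deleted but kept alive by s1
    System.out.println(count);
  }
}
{code}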

> Correctly handle snapshot path for getContentSummary
> 
>
> Key: HDFS-9063
> URL: https://issues.apache.org/jira/browse/HDFS-9063
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-9063.000.patch
>
>
> The current getContentSummary implementation does not take into account the 
> snapshot path, thus if we have the following ops:
> 1. create dirs /foo/bar
> 2. take snapshot s1 on /foo
> 3. create a 1 byte file /foo/bar/baz
> then "du /foo" and "du /foo/.snapshot/s1" can report same results for "bar", 
> which is incorrect since the 1 byte file is not included in snapshot s1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path

2015-09-17 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-5802:

Attachment: (was: HDFS-5802.002.patch)

> NameNode does not check for inode type before traversing down a path
> 
>
> Key: HDFS-5802
> URL: https://issues.apache.org/jira/browse/HDFS-5802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Xiao Chen
>Priority: Trivial
> Attachments: HDFS-5802.001.patch
>
>
> This came up during the discussion on a forum at 
> http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162
>  surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is 
> a file and not a directory.
> In such a case, NameNode yields a user-confusing message of {{Permission 
> denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead 
> of clearly saying (and realising) "/foo is not a directory" or "/foo is a 
> file" before it tries to traverse further down to locate the requested path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9098) Erasure coding: emulate race conditions among striped streamers in write pipeline

2015-09-17 Thread Zhe Zhang (JIRA)
Zhe Zhang created HDFS-9098:
---

 Summary: Erasure coding: emulate race conditions among striped 
streamers in write pipeline
 Key: HDFS-9098
 URL: https://issues.apache.org/jira/browse/HDFS-9098
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang


Apparently the interleaving of events among {{StripedDataStreamer}}s is very 
tricky to handle. [~walter.k.su] and [~jingzhao] have discussed several race 
conditions under HDFS-9040.

Let's use FaultInjector to emulate different combinations of interleaved events.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9040) Erasure coding: Refactor DFSStripedOutputStream (Move Namenode RPC Requests to Coordinator)

2015-09-17 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803316#comment-14803316
 ] 

Zhe Zhang commented on HDFS-9040:
-

bq. 3. We might want to add the logic to replace a failed StripedDataStreamer 
in the future.
bq. No, we won't. I think so? if you're talking something like Datanode 
replacement for repl block. You can transfer a healthy repl RBW to a new 
Datanode, then you still get 3 DNs after replacement. But recover a corrupted 
RBW internal block is difficult.
I agree it's difficult, and in this phase I don't think it's necessary. We 
cannot rule out the possibility though. In the current non-EC pipeline we 
support multiple failover options. A fast writer can opt out of DN replacement 
and instead rely on background re-replication. A slow writer might want to 
replace the DN to prevent data loss during the long window. For a slow EC 
writer we should consider fixing the pipeline as well, especially at the early 
stage of writing a block (when there is not much data to decode).

bq. 1. A client read UC block being written can decode replica if it misses 
some part. ( With checksum verification, we are only concern about 'missing')
Interesting thought. But {{verifyChecksum}} is optional, so we can't always 
rely on it. If {{verifyChecksum}} became mandatory, much of our corrupt 
replica handling logic could be much simpler.

> Erasure coding: Refactor DFSStripedOutputStream (Move Namenode RPC Requests 
> to Coordinator)
> ---
>
> Key: HDFS-9040
> URL: https://issues.apache.org/jira/browse/HDFS-9040
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Walter Su
> Attachments: HDFS-9040-HDFS-7285.002.patch, 
> HDFS-9040-HDFS-7285.003.patch, HDFS-9040.00.patch, HDFS-9040.001.wip.patch, 
> HDFS-9040.02.bgstreamer.patch
>
>
> The general idea is to simplify error handling logic.
> Proposal 1:
> A BlockGroupDataStreamer to communicate with NN to allocate/update block, and 
> StripedDataStreamer s only have to stream blocks to DNs.
> Proposal 2:
> See below the 
> [comment|https://issues.apache.org/jira/browse/HDFS-9040?focusedCommentId=14741388=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14741388]
>  from [~jingzhao].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9089) Balancer and Mover should use ".system" as reserved inode name instead of "system"

2015-09-17 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803367#comment-14803367
 ] 

Surendra Singh Lilhore commented on HDFS-9089:
--

bq. Why ".system" is better than "system"? The same argument applies – what if 
users want to create ".system"?

We assumed a user would not create a directory named "{{.system}}" :)

> Balancer and Mover should use ".system" as reserved inode name instead of 
> "system"
> --
>
> Key: HDFS-9089
> URL: https://issues.apache.org/jira/browse/HDFS-9089
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Reporter: Archana T
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-9089.01.patch, HDFS-9089.02.patch
>
>
> Currently Balancer and Mover create "/system" for placing mover.id and 
> balancer.id
> hdfs dfs -ls /
> drwxr-xr-x   - root hadoop  0 2015-09-16 12:49 
> {color:red}/system{color}
> This folder is not deleted once the mover or balancer work is completed, so 
> the user cannot create a dir named "system".
> It's better to make ".system" a reserved inode for balancer and mover instead 
> of "system".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8632) Erasure Coding: Add InterfaceAudience annotation to the erasure coding classes

2015-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803375#comment-14803375
 ] 

Hadoop QA commented on HDFS-8632:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  18m  1s | Findbugs (version ) appears to 
be broken on HDFS-7285. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 48s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 20s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 15s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m  6s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m 10s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 38s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   6m 32s | The patch appears to introduce 4 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |  23m 49s | Tests passed in 
hadoop-common. |
| {color:red}-1{color} | hdfs tests |  41m  4s | Tests failed in hadoop-hdfs. |
| {color:red}-1{color} | hdfs tests |   0m 20s | Tests failed in 
hadoop-hdfs-client. |
| | | 112m 42s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | 
hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerForContentSummary |
|   | hadoop.hdfs.TestSafeMode |
|   | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.TestModTime |
|   | hadoop.hdfs.server.blockmanagement.TestBlockManager |
|   | hadoop.hdfs.server.datanode.TestReadOnlySharedStorage |
|   | hadoop.hdfs.server.datanode.TestIncrementalBlockReports |
|   | hadoop.hdfs.TestReservedRawPaths |
|   | hadoop.hdfs.server.namenode.TestClusterId |
|   | hadoop.hdfs.server.namenode.ha.TestFailoverWithBlockTokensEnabled |
|   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlockQueues |
|   | hadoop.hdfs.web.TestHttpsFileSystem |
|   | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.TestRemoteBlockReader2 |
|   | hadoop.hdfs.protocol.TestBlockListAsLongs |
|   | hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshotRename |
|   | hadoop.hdfs.TestReplication |
|   | hadoop.hdfs.TestBlocksScheduledCounter |
|   | hadoop.hdfs.qjournal.client.TestQJMWithFaults |
|   | hadoop.hdfs.server.namenode.TestNameNodeAcl |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot |
|   | hadoop.hdfs.TestDFSInotifyEventInputStream |
|   | hadoop.hdfs.TestAbandonBlock |
|   | hadoop.hdfs.TestSetTimes |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogsDuringFailover |
|   | hadoop.hdfs.server.namenode.TestNameEditsConfigs |
|   | hadoop.hdfs.TestDFSFinalize |
|   | hadoop.hdfs.web.TestFSMainOperationsWebHdfs |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.server.datanode.TestBlockRecovery |
|   | hadoop.hdfs.server.namenode.ha.TestBootstrapStandby |
|   | hadoop.hdfs.server.namenode.ha.TestStandbyBlockManagement |
|   | hadoop.hdfs.server.namenode.TestCommitBlockSynchronization |
|   | hadoop.hdfs.server.namenode.TestNameNodeRecovery |
|   | hadoop.hdfs.server.namenode.TestAuditLogs |
|   | hadoop.hdfs.server.namenode.TestStorageRestore |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshotDeletion |
|   | hadoop.hdfs.server.blockmanagement.TestNodeCount |
|   | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots |
|   | hadoop.hdfs.server.namenode.ha.TestInitializeSharedEdits |
|   | hadoop.hdfs.TestAppendDifferentChecksum |
|   | hadoop.hdfs.server.namenode.TestEditLogRace |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestDatanodeRestart |
|   | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
|   | hadoop.hdfs.server.namenode.TestNameNodeResourceChecker |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
|   | hadoop.hdfs.TestHDFSFileSystemContract |
|   | hadoop.hdfs.tools.TestDFSZKFailoverController |
|   | hadoop.hdfs.server.datanode.TestNNHandlesCombinedBlockReport |
|   | hadoop.hdfs.TestDFSShell |
|   | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints |
|   | hadoop.hdfs.TestMissingBlocksAlert |
|   | 

[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path

2015-09-17 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-5802:

Status: Patch Available  (was: Open)

> NameNode does not check for inode type before traversing down a path
> 
>
> Key: HDFS-5802
> URL: https://issues.apache.org/jira/browse/HDFS-5802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Xiao Chen
>Priority: Trivial
> Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch, 
> HDFS-5802.003.patch
>
>
> This came up during the discussion on a forum at 
> http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162
>  surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is 
> a file and not a directory.
> In such a case, NameNode yields a user-confusing message of {{Permission 
> denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead 
> of clearly saying (and realising) "/foo is not a directory" or "/foo is a 
> file" before it tries to traverse further down to locate the requested path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path

2015-09-17 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-5802:

Attachment: HDFS-5802.003.patch

Thanks Yongjun for the additional comments!
I have fixed it and uploaded patch 003.

> NameNode does not check for inode type before traversing down a path
> 
>
> Key: HDFS-5802
> URL: https://issues.apache.org/jira/browse/HDFS-5802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Xiao Chen
>Priority: Trivial
> Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch, 
> HDFS-5802.003.patch
>
>
> This came up during the discussion on a forum at 
> http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162
>  surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is 
> a file and not a directory.
> In such a case, NameNode yields a user-confusing message of {{Permission 
> denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead 
> of clearly saying (and realising) "/foo is not a directory" or "/foo is a 
> file" before it tries to traverse further down to locate the requested path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7492) If multiple threads call FsVolumeList#checkDirs at the same time, we should only do checkDirs once and give the results to all waiting threads

2015-09-17 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803419#comment-14803419
 ] 

Colin Patrick McCabe commented on HDFS-7492:


[~eclark], also check HDFS-8845 for another improvement in this area.

> If multiple threads call FsVolumeList#checkDirs at the same time, we should 
> only do checkDirs once and give the results to all waiting threads
> --
>
> Key: HDFS-7492
> URL: https://issues.apache.org/jira/browse/HDFS-7492
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Colin Patrick McCabe
>Assignee: Elliott Clark
>Priority: Minor
>
> checkDirs is called when we encounter certain I/O errors.  It's rare to get 
> just a single I/O error... normally you start getting many errors when a disk 
> is going bad.  For this reason, we shouldn't start a new checkDirs scan for 
> each error.  Instead, if multiple threads call FsVolumeList#checkDirs at 
> around the same time, we should only do checkDirs once and give the results 
> to all the waiting threads.
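
For illustration, a sketch of the coalescing pattern described above (hypothetical names, not the actual FsVolumeList change; requires Java 9+ for compareAndExchange):

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

final class CoalescingCheckDirs {
  private final AtomicReference<CompletableFuture<List<String>>> inFlight =
      new AtomicReference<>();

  List<String> checkDirs() {
    CompletableFuture<List<String>> mine = new CompletableFuture<>();
    CompletableFuture<List<String>> winner = inFlight.compareAndExchange(null, mine);
    if (winner != null) {
      return winner.join();        // a scan is already running; share its result
    }
    try {
      mine.complete(scanOnce());   // we won the race: do the real scan once
    } catch (RuntimeException e) {
      mine.completeExceptionally(e);
      throw e;
    } finally {
      inFlight.set(null);          // let the next caller start a fresh scan
    }
    return mine.join();
  }

  private List<String> scanOnce() {
    return new ArrayList<>();      // placeholder for the real volume check
  }
}
{code}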



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9063) Correctly handle snapshot path for getContentSummary

2015-09-17 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803250#comment-14803250
 ] 

Jing Zhao commented on HDFS-9063:
-

The test failures look suspicious. I just triggered Jenkins again.

> Correctly handle snapshot path for getContentSummary
> 
>
> Key: HDFS-9063
> URL: https://issues.apache.org/jira/browse/HDFS-9063
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-9063.000.patch
>
>
> The current getContentSummary implementation does not take into account the 
> snapshot path, thus if we have the following ops:
> 1. create dirs /foo/bar
> 2. take snapshot s1 on /foo
> 3. create a 1 byte file /foo/bar/baz
> then "du /foo" and "du /foo/.snapshot/s1" can report same results for "bar", 
> which is incorrect since the 1 byte file is not included in snapshot s1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby

2015-09-17 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803321#comment-14803321
 ] 

Zhe Zhang commented on HDFS-8808:
-

Triggering Jenkins again.

> dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
> 
>
> Key: HDFS-8808
> URL: https://issues.apache.org/jira/browse/HDFS-8808
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Gautam Gopalakrishnan
>Assignee: Zhe Zhang
> Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch, 
> HDFS-8808-02.patch, HDFS-8808-03.patch, HDFS-8808.04.patch
>
>
> The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the 
> speed with which the fsimage is copied between the namenodes during regular 
> use. However, as a side effect, this also limits transfers when the 
> {{-bootstrapStandby}} option is used. This option is often used during 
> upgrades and could potentially slow down the entire workflow. The request 
> here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth 
> setting



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9089) Balancer and Mover should use ".system" as reserved inode name instead of "system"

2015-09-17 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803332#comment-14803332
 ] 

Surendra Singh Lilhore commented on HDFS-9089:
--

Thanks [~szetszwo] for the comment.

The main purpose of this jira is to delete the {{system}} directory created by 
Mover and Balancer once their task is complete, but we can't delete the 
{{system}} directory directly because it may have been created by a user. We 
thought we could use a reserved directory name like {{.system}}, so we can 
delete it after the mover and balancer task completes.

bq. This looks like an incompatible change.

I didn't get how it is incompatible.

> Balancer and Mover should use ".system" as reserved inode name instead of 
> "system"
> --
>
> Key: HDFS-9089
> URL: https://issues.apache.org/jira/browse/HDFS-9089
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Reporter: Archana T
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-9089.01.patch, HDFS-9089.02.patch
>
>
> Currently Balancer and Mover create "/system" for placing mover.id and 
> balancer.id
> hdfs dfs -ls /
> drwxr-xr-x   - root hadoop  0 2015-09-16 12:49 
> {color:red}/system{color}
> This folder is not deleted once the mover or balancer work is completed, so 
> a user cannot create a dir named "system".
> It would be better to make ".system" a reserved inode for balancer and mover 
> instead of "system".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections

2015-09-17 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803351#comment-14803351
 ] 

Owen O'Malley commented on HDFS-8855:
-

I'm looking at the patch, but you'll need to resolve the checkstyle, findbugs, 
and test case failures.

> Webhdfs client leaks active NameNode connections
> 
>
> Key: HDFS-8855
> URL: https://issues.apache.org/jira/browse/HDFS-8855
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Bob Hansen
>Assignee: Xiaobing Zhou
> Attachments: HDFS-8855.1.patch, HDFS-8855.2.patch, HDFS-8855.3.patch, 
> HDFS-8855.4.patch, HDFS_8855.prototype.patch
>
>
> The attached script simulates a process opening ~50 files via webhdfs and 
> performing random reads.  Note that there are at most 50 concurrent reads, 
> and all webhdfs sessions are kept open.  Each read is ~64k at a random 
> position.  
> The script periodically (once per second) shells into the NameNode and 
> produces a summary of the socket states.  For my test cluster with 5 nodes, 
> it took ~30 seconds for the NameNode to have ~25000 active connections and 
> fails.
> It appears that each request to the webhdfs client is opening a new 
> connection to the NameNode and keeping it open after the request is complete. 
>  If the process continues to run, eventually (~30-60 seconds), all of the 
> open connections are closed and the NameNode recovers.  
> This smells like SoftReference reaping.  Are we using SoftReferences in the 
> webhdfs client to cache NameNode connections but never re-using them?
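
To illustrate the suspicion, here is the anti-pattern being hypothesized, in 
sketch form (purely illustrative; not the actual webhdfs client code):

{code}
// Entries held only via SoftReference are cleared lazily by the GC, so
// lookups keep missing and new connections pile up until the references
// are finally collected (matching the ~30-60 second recovery seen above).
Map<InetSocketAddress, SoftReference<HttpURLConnection>> cache =
    new ConcurrentHashMap<>();

HttpURLConnection getConnection(InetSocketAddress nn) throws IOException {
  SoftReference<HttpURLConnection> ref = cache.get(nn);
  HttpURLConnection conn = (ref == null) ? null : ref.get();
  if (conn == null) {  // miss or collected: yet another connection is opened
    conn = (HttpURLConnection) new URL("http", nn.getHostName(),
        nn.getPort(), "/webhdfs/v1/").openConnection();
    cache.put(nn, new SoftReference<>(conn));
  }
  return conn;
}
{code}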



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8968) New benchmark throughput tool for striping erasure coding

2015-09-17 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803386#comment-14803386
 ] 

Andrew Wang commented on HDFS-8968:
---

In that case maybe we throw it in hadoop-tools? The only concern there is that 
without unit tests the code won't be exercised regularly, and it might get out 
of date.

I think we could make it run against both a real cluster and a MiniDFSCluster 
also, since ultimately we're just using the FileSystem API.
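
To make that concrete, a minimal sketch of a cluster-agnostic timing loop; it 
only touches the FileSystem API, so the same code runs against a real cluster 
or a MiniDFSCluster depending on the Configuration passed in (a sketch, not 
the proposed tool):

{code}
// Measures sequential write throughput in MB/s through the FileSystem API.
static double writeMBps(Configuration conf, Path file, int sizeMB)
    throws IOException {
  FileSystem fs = FileSystem.get(conf);
  byte[] buf = new byte[1 << 20];                 // 1 MB chunks
  long start = System.nanoTime();
  try (FSDataOutputStream out = fs.create(file, true)) {
    for (int i = 0; i < sizeMB; i++) {
      out.write(buf);
    }
  }
  return sizeMB / ((System.nanoTime() - start) / 1e9);
}
{code}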

> New benchmark throughput tool for striping erasure coding
> -
>
> Key: HDFS-8968
> URL: https://issues.apache.org/jira/browse/HDFS-8968
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Rui Li
> Attachments: HDFS-8968-HDFS-7285.1.patch, HDFS-8968-HDFS-7285.2.patch
>
>
> We need a new benchmark tool to measure the throughput of client writing and 
> reading considering cases or factors:
> * 3-replica or striping;
> * write or read, stateful read or positional read;
> * which erasure coder;
> * striping cell size;
> * concurrent readers/writers using processes or threads.
> The tool should be easy to use, and should avoid unnecessary local 
> environment impact, such as the local disk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9097) Erasure coding: update EC command "-s" flag to "-p" when specifying policy

2015-09-17 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803396#comment-14803396
 ] 

Andrew Wang commented on HDFS-9097:
---

+1 LGTM, thanks for the update Zhe

> Erasure coding: update EC command "-s" flag to "-p" when specifying policy
> --
>
> Key: HDFS-9097
> URL: https://issues.apache.org/jira/browse/HDFS-9097
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-9097-HDFS-7285.00.patch
>
>
> HDFS-8833 missed this update.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5802) NameNode does not check for inode type before traversing down a path

2015-09-17 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803388#comment-14803388
 ] 

Yongjun Zhang commented on HDFS-5802:
-

Hi [~xiaochen],

Thanks for the new rev. Rev 2 looks good to me, two small nits
{code}
195   /**
196* Check whether an exception is due to inode type not directory
197*/
198   private void checkAncestorType(INode[] inodes, int ancestorIndex,
199 AccessControlException e) throws 
AccessControlException {
{code}

1.  Suggest changing the comment above to 
   "Check whether exception e is due to an ancestor inode not being a 
directory"
2.  The indentation of line 199 should be 4 (see the sketch below).
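
With both nits applied, the snippet would read roughly as follows (a sketch 
from the patch context; signature only):

{code}
  /**
   * Check whether exception e is due to an ancestor inode not being a
   * directory.
   */
  private void checkAncestorType(INode[] inodes, int ancestorIndex,
      AccessControlException e) throws AccessControlException {
{code}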

+1 after that,  pending jenkins tests.

Thanks.


> NameNode does not check for inode type before traversing down a path
> 
>
> Key: HDFS-5802
> URL: https://issues.apache.org/jira/browse/HDFS-5802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Xiao Chen
>Priority: Trivial
> Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch
>
>
> This came up during the discussion on a forum at 
> http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162
>  surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is 
> a file and not a directory.
> In such a case, NameNode yields a user-confusing message of {{Permission 
> denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead 
> of clearly saying (and realising) "/foo is not a directory" or "/foo is a 
> file" before it tries to traverse further down to locate the requested path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9098) Erasure coding: emulate race conditions among striped streamers in write pipeline

2015-09-17 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-9098:

Description: 
Apparently the interleaving of events among {{StripedDataStreamer}}'s is very 
tricky to handle. [~walter.k.su] and [~jingzhao] have discussed several race 
conditions under HDFS-9040.

Let's use FaultInjector to emulate different combinations of interleaved events.

In particular, we should consider inject delays in the following places:
# {{Streamer#endBlock}}
# {{Streamer#locateFollowingBlock}}
# {{Streamer#updateBlockForPipeline}}
# {{Streamer#updatePipeline}}
# {{OutputStream#writeChunk}}
# {{OutputStream#close}}

  was:
Apparently the interleaving of events among {{StripedDataStreamer}}s is very 
tricky to handle. [~walter.k.su] and [~jingzhao] have discussed several race 
conditions under HDFS-9040.

Let's use FaultInjector to emulate different combinations of interleaved events.


> Erasure coding: emulate race conditions among striped streamers in write 
> pipeline
> -
>
> Key: HDFS-9098
> URL: https://issues.apache.org/jira/browse/HDFS-9098
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>
> Apparently the interleaving of events among {{StripedDataStreamer}}'s is very 
> tricky to handle. [~walter.k.su] and [~jingzhao] have discussed several race 
> conditions under HDFS-9040.
> Let's use FaultInjector to emulate different combinations of interleaved 
> events.
> In particular, we should consider inject delays in the following places:
> # {{Streamer#endBlock}}
> # {{Streamer#locateFollowingBlock}}
> # {{Streamer#updateBlockForPipeline}}
> # {{Streamer#updatePipeline}}
> # {{OutputStream#writeChunk}}
> # {{OutputStream#close}}
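
A bare-bones shape for such an injector, in sketch form (illustrative only; 
the hook names below are hypothetical, and the real DFSClient fault injectors 
may differ):

{code}
// Production code calls the no-op defaults; a test swaps in an instance
// that sleeps at the chosen point to force one streamer to lag the others.
public class StripedStreamerFaultInjector {
  public static StripedStreamerFaultInjector instance =
      new StripedStreamerFaultInjector();

  public void delayEndBlock() {}          // hook before Streamer#endBlock
  public void delayUpdatePipeline() {}    // hook before Streamer#updatePipeline
}

// In a test:
StripedStreamerFaultInjector.instance = new StripedStreamerFaultInjector() {
  @Override
  public void delayEndBlock() {
    try {
      Thread.sleep(3000);                 // emulate a slow streamer
    } catch (InterruptedException ignored) {
    }
  }
};
{code}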



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9088) Cleanup erasure coding documentation

2015-09-17 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-9088:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: HDFS-7285
   Status: Resolved  (was: Patch Available)

I just committed to the feature branch. Thanks Andrew for the work!

> Cleanup erasure coding documentation
> 
>
> Key: HDFS-9088
> URL: https://issues.apache.org/jira/browse/HDFS-9088
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: HDFS-7285
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Fix For: HDFS-7285
>
> Attachments: hdfs-9088.001.patch, hdfs-9088.002.patch
>
>
> The documentation could use a pass to clean up typos, unify formatting, and 
> also make it more user-oriented.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path

2015-09-17 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-5802:

Attachment: HDFS-5802.002.patch

Thanks Yongjun for the review! It makes sense, and I've addressed comments 1-4. 
(The code was 80 chars, so no action for 5.)
Submitting incremental patch 002 with my changes.

> NameNode does not check for inode type before traversing down a path
> 
>
> Key: HDFS-5802
> URL: https://issues.apache.org/jira/browse/HDFS-5802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Xiao Chen
>Priority: Trivial
> Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch
>
>
> This came up during the discussion on a forum at 
> http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162
>  surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is 
> a file and not a directory.
> In such a case, NameNode yields a user-confusing message of {{Permission 
> denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead 
> of clearly saying (and realising) "/foo is not a directory" or "/foo is a 
> file" before it tries to traverse further down to locate the requested path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path

2015-09-17 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-5802:

Status: Open  (was: Patch Available)

> NameNode does not check for inode type before traversing down a path
> 
>
> Key: HDFS-5802
> URL: https://issues.apache.org/jira/browse/HDFS-5802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Xiao Chen
>Priority: Trivial
> Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch
>
>
> This came up during the discussion on a forum at 
> http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162
>  surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is 
> a file and not a directory.
> In such a case, NameNode yields a user-confusing message of {{Permission 
> denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead 
> of clearly saying (and realising) "/foo is not a directory" or "/foo is a 
> file" before it tries to traverse further down to locate the requested path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path

2015-09-17 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-5802:

Attachment: HDFS-5802.002.patch

> NameNode does not check for inode type before traversing down a path
> 
>
> Key: HDFS-5802
> URL: https://issues.apache.org/jira/browse/HDFS-5802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Xiao Chen
>Priority: Trivial
> Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch
>
>
> This came up during the discussion on a forum at 
> http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162
>  surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is 
> a file and not a directory.
> In such a case, NameNode yields a user-confusing message of {{Permission 
> denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead 
> of clearly saying (and realising) "/foo is not a directory" or "/foo is a 
> file" before it tries to traverse further down to locate the requested path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9004) Add upgrade domain to DatanodeInfo

2015-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803271#comment-14803271
 ] 

Hadoop QA commented on HDFS-9004:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  19m 37s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   8m  0s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 16s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 35s | The applied patch generated  5 
new checkstyle issues (total was 124, now 127). |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 40s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 29s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 36s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 129m  0s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 32s | Tests passed in 
hadoop-hdfs-client. |
| | | 180m 45s | |
\\
\\
|| Reason || Tests ||
| Timed out tests | org.apache.hadoop.hdfs.server.namenode.TestStorageRestore |
|   | org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot 
|
|   | org.apache.hadoop.hdfs.server.namenode.TestNameEditsConfigs |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756094/HDFS-9004-2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 6c6e734 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12507/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12507/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12507/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12507/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12507/console |


This message was automatically generated.

> Add upgrade domain to DatanodeInfo
> --
>
> Key: HDFS-9004
> URL: https://issues.apache.org/jira/browse/HDFS-9004
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-9004-2.patch, HDFS-9004.patch
>
>
> As part of upgrade domain feature, we first need to add upgrade domain string 
> to {{DatanodeInfo}}. It includes things like:
> * Add a new field to DatanodeInfo.
> * Modify protobuf for DatanodeInfo.
> * Update DatanodeInfo.getDatanodeReport to include upgrade domain.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path

2015-09-17 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-5802:

Status: Open  (was: Patch Available)

> NameNode does not check for inode type before traversing down a path
> 
>
> Key: HDFS-5802
> URL: https://issues.apache.org/jira/browse/HDFS-5802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Xiao Chen
>Priority: Trivial
> Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch
>
>
> This came up during the discussion on a forum at 
> http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162
>  surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is 
> a file and not a directory.
> In such a case, NameNode yields a user-confusing message of {{Permission 
> denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead 
> of clearly saying (and realising) "/foo is not a directory" or "/foo is a 
> file" before it tries to traverse further down to locate the requested path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby

2015-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804288#comment-14804288
 ] 

Hadoop QA commented on HDFS-8808:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  19m  9s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   8m 10s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  11m 15s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 29s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 31s | The applied patch generated  1 
new checkstyle issues (total was 546, now 546). |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 55s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 43s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 44s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 12s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |   0m 19s | Tests failed in hadoop-hdfs. |
| | |  49m 32s | |
\\
\\
|| Reason || Tests ||
| Failed build | hadoop-hdfs |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12751151/HDFS-8808.04.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 58d1a02 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12514/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12514/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12514/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12514/console |


This message was automatically generated.

> dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
> 
>
> Key: HDFS-8808
> URL: https://issues.apache.org/jira/browse/HDFS-8808
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Gautam Gopalakrishnan
>Assignee: Zhe Zhang
> Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch, 
> HDFS-8808-02.patch, HDFS-8808-03.patch, HDFS-8808.04.patch
>
>
> The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the 
> speed with which the fsimage is copied between the namenodes during regular 
> use. However, as a side effect, this also limits transfers when the 
> {{-bootstrapStandby}} option is used. This option is often used during 
> upgrades and could potentially slow down the entire workflow. The request 
> here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth 
> setting



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path

2015-09-17 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-5802:

Status: Patch Available  (was: Open)

> NameNode does not check for inode type before traversing down a path
> 
>
> Key: HDFS-5802
> URL: https://issues.apache.org/jira/browse/HDFS-5802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Xiao Chen
>Priority: Trivial
> Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch
>
>
> This came up during the discussion on a forum at 
> http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162
>  surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is 
> a file and not a directory.
> In such a case, NameNode yields a user-confusing message of {{Permission 
> denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead 
> of clearly saying (and realising) "/foo is not a directory" or "/foo is a 
> file" before it tries to traverse further down to locate the requested path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-3107) HDFS truncate

2015-09-17 Thread Constantine Peresypkin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803234#comment-14803234
 ] 

Constantine Peresypkin commented on HDFS-3107:
--

Hmm, fast patching nfs did not work; it intermittently fails on "lease already 
acquired".
It seems like the NFS gateway holds leases to all the files it has opened in 
some sort of cache.
Very strange.

> HDFS truncate
> -
>
> Key: HDFS-3107
> URL: https://issues.apache.org/jira/browse/HDFS-3107
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Reporter: Lei Chang
>Assignee: Plamen Jeliazkov
> Fix For: 2.7.0
>
> Attachments: HDFS-3107-13.patch, HDFS-3107-14.patch, 
> HDFS-3107-15.patch, HDFS-3107-HDFS-7056-combined.patch, HDFS-3107.008.patch, 
> HDFS-3107.15_branch2.patch, HDFS-3107.patch, HDFS-3107.patch, 
> HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
> HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
> HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, 
> HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, 
> HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf, 
> HDFS_truncate_semantics_Mar21.pdf, editsStored, editsStored.xml
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the 
> underlying storage when a transaction is aborted. Currently HDFS does not 
> support truncate (a standard Posix operation) which is a reverse operation of 
> append, which makes upper layer applications use ugly workarounds (such as 
> keeping track of the discarded byte range per file in a separate metadata 
> store, and periodically running a vacuum process to rewrite compacted files) 
> to overcome this limitation of HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path

2015-09-17 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-5802:

Status: Patch Available  (was: Open)

> NameNode does not check for inode type before traversing down a path
> 
>
> Key: HDFS-5802
> URL: https://issues.apache.org/jira/browse/HDFS-5802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Xiao Chen
>Priority: Trivial
> Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch
>
>
> This came up during the discussion on a forum at 
> http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162
>  surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is 
> a file and not a directory.
> In such a case, NameNode yields a user-confusing message of {{Permission 
> denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead 
> of clearly saying (and realising) "/foo is not a directory" or "/foo is a 
> file" before it tries to traverse further down to locate the requested path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9085) Show renewer information in DelegationTokenIdentifier#toString

2015-09-17 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803307#comment-14803307
 ] 

zhihai xu commented on HDFS-9085:
-

Thanks for the review [~cnauroth]! That is great information. Yes, it makes 
sense to commit the patch to trunk only.
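
For reference, the gist of the change is roughly the following (a sketch 
assuming the accessors on {{AbstractDelegationTokenIdentifier}}; not the exact 
patch):

{code}
@Override
public String toString() {
  // Including the renewer makes issues involving
  // hadoop.security.auth_to_local rewriting much easier to debug.
  return getKind() + " token " + getSequenceNumber()
      + " for " + getUser().getShortUserName()
      + " with renewer " + getRenewer();
}
{code}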

> Show renewer information in DelegationTokenIdentifier#toString
> --
>
> Key: HDFS-9085
> URL: https://issues.apache.org/jira/browse/HDFS-9085
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Trivial
> Attachments: HDFS-9085.001.patch
>
>
> Show renewer information in {{DelegationTokenIdentifier#toString}}. Currently 
> {{org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier}}
>  doesn't show the renewer information. It would be very useful to have the 
> renewer information when debugging security-related issues. Because the 
> renewer is filtered by "hadoop.security.auth_to_local", it is helpful to show 
> the real renewer info after applying the rules.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9022) Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client

2015-09-17 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803300#comment-14803300
 ] 

Mingliang Liu commented on HDFS-9022:
-

Thank you [~wheat9] for reviewing the code. I filed jira [MAPREDUCE-6483] for 
changes in {{hadoop-mapreduce}} module.

> Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client
> --
>
> Key: HDFS-9022
> URL: https://issues.apache.org/jira/browse/HDFS-9022
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client, namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9022.000.patch, HDFS-9022.001.patch, 
> HDFS-9022.002.patch, HDFS-9022.003.patch, HDFS-9022.004.patch
>
>
> The static helper methods in NameNodes are used in {{hdfs-client}} module. 
> For example, it's used by the {{DFSClient}} and {{NameNodeProxies}} classes 
> which are being moved to {{hadoop-hdfs-client}} module. Meanwhile, we should 
> keep the {{NameNode}} class itself in the {{hadoop-hdfs}} module.
> This jira tracks the effort of moving the following static helper methods out 
> of  {{NameNode}} and thus {{hadoop-hdfs}} module. A good place to put these 
> methods is the {{DFSUtilClient}} class:
> {code}
> public static InetSocketAddress getAddress(String address);
> public static InetSocketAddress getAddress(Configuration conf);
> public static InetSocketAddress getAddress(URI filesystemURI);
> public static URI getUri(InetSocketAddress namenode);
> {code}
> Be cautious not to bring new checkstyle warnings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8873) throttle directoryScanner

2015-09-17 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804279#comment-14804279
 ] 

Daniel Templeton commented on HDFS-8873:


[sigh]  Those tests pass for me locally, so I can't say why they failed.

The whitespace error is interesting.  I changed line n in the patch.  Jenkins 
complained about the whitespace on line n+1.  I fixed the whitespace on line 
n+1 in the next patch.  Jenkins is now complaining about the whitespace on line 
n+2.  There is no issue on line n+3, so I could correct n+2 and be done, but at 
that point I've made whitespace changes on two lines that I didn't otherwise 
touch.  What's the accepted way to do it: fix the whitespace or ignore the 
error?

> throttle directoryScanner
> -
>
> Key: HDFS-8873
> URL: https://issues.apache.org/jira/browse/HDFS-8873
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Nathan Roberts
>Assignee: Daniel Templeton
> Attachments: HDFS-8873.001.patch, HDFS-8873.002.patch, 
> HDFS-8873.003.patch
>
>
> The new 2-level directory layout can make directory scans expensive in terms 
> of disk seeks (see HDFS-8791) for details. 
> It would be good if the directoryScanner() had a configurable duty cycle that 
> would reduce its impact on disk performance (much like the approach in 
> HDFS-8617). 
> Without such a throttle, disks can go 100% busy for many minutes at a time 
> (assuming the common case of all inodes in cache but no directory blocks 
> cached, 64K seeks are required for full directory listing which translates to 
> 655 seconds) 
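
For context, the duty cycle described above boils down to something like the 
following (a sketch, not the patch; {{runMsPerSec}} and {{scanOne}} are 
hypothetical, and the budget is assumed to be under 1000 ms):

{code}
// Scan for at most runMsPerSec ms out of every second, then sleep the
// remainder of the window so the disks get idle time.
void throttledScan(Iterator<File> dirs, long runMsPerSec)
    throws InterruptedException {
  long windowStart = Time.monotonicNow();
  while (dirs.hasNext()) {
    scanOne(dirs.next());
    if (Time.monotonicNow() - windowStart >= runMsPerSec) {
      Thread.sleep(1000 - runMsPerSec);   // idle for the rest of the window
      windowStart = Time.monotonicNow();
    }
  }
}
{code}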



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9004) Add upgrade domain to DatanodeInfo

2015-09-17 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804388#comment-14804388
 ] 

Lei (Eddy) Xu commented on HDFS-9004:
-

Hi, [~mingma]

Your patch looks very good in general. 

It'd be great to address one small thing: is the following change necessary in 
this patch?

{code:title=DFSTestUtil.java}
1l, 2l, 3l, 4l, 0l, 0l, 0l, 5, 6, "local", adminState,
+   ipAddr + ":" + DFSConfigKeys.DFS_DATANODE_DEFAULT_PORT);
{code}

Will +1 once the above comment is addressed and the test failures are verified 
to be unrelated.

> Add upgrade domain to DatanodeInfo
> --
>
> Key: HDFS-9004
> URL: https://issues.apache.org/jira/browse/HDFS-9004
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-9004-2.patch, HDFS-9004.patch
>
>
> As part of upgrade domain feature, we first need to add upgrade domain string 
> to {{DatanodeInfo}}. It includes things like:
> * Add a new field to DatanodeInfo.
> * Modify protobuf for DatanodeInfo.
> * Update DatanodeInfo.getDatanodeReport to include upgrade domain.
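
For readers following along, the shape of the {{DatanodeInfo}} change is 
roughly the following (a sketch per the description; the field naming is 
illustrative, not the final patch):

{code}
// New field plus accessors on DatanodeInfo; the matching protobuf change
// would add an optional string to DatanodeInfoProto.
private String upgradeDomain;  // null when no upgrade domain is configured

public String getUpgradeDomain() {
  return upgradeDomain;
}

public void setUpgradeDomain(String upgradeDomain) {
  this.upgradeDomain = upgradeDomain;
}
{code}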



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7955) Improve naming of classes, methods, and variables related to block replication and recovery

2015-09-17 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804298#comment-14804298
 ] 

Andrew Wang commented on HDFS-7955:
---

I noticed with HDFS-8899 the datanode config keys need some renaming:

{noformat}
  public static final String  DFS_DATANODE_STRIPED_READ_THREADS_KEY = 
"dfs.datanode.stripedread.threads";
  public static final int DFS_DATANODE_STRIPED_READ_THREADS_DEFAULT = 20;
  public static final String  DFS_DATANODE_STRIPED_READ_BUFFER_SIZE_KEY = 
"dfs.datanode.stripedread.buffer.size";
  public static final int DFS_DATANODE_STRIPED_READ_BUFFER_SIZE_DEFAULT = 
64 * 1024;
  public static final String  DFS_DATANODE_STRIPED_READ_TIMEOUT_MILLIS_KEY = 
"dfs.datanode.stripedread.timeout.millis";
  public static final int DFS_DATANODE_STRIPED_READ_TIMEOUT_MILLIS_DEFAULT 
= 5000; //5s
  public static final String  DFS_DATANODE_STRIPED_BLK_RECOVERY_THREADS_KEY = 
"dfs.datanode.striped.blockrecovery.threads.size";
  public static final int DFS_DATANODE_STRIPED_BLK_RECOVERY_THREADS_DEFAULT 
= 8;
{noformat}

The term "block recovery" is overloaded here, I'd recommend "reconstruction" 
instead. All of these config keys are also for ECWorker and related, so should 
also have the same prefix, e.g. "dfs.datanode.ec.reconstruction" or something. 
IIUC there's a "read" thread pool and a "compute" thread pool; that distinction 
hopefully is also made apparent in the key naming and descriptions.
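
Concretely, the renaming might end up looking like the following (illustrative 
key names only, following the suggested "dfs.datanode.ec.reconstruction" 
prefix; not a proposal for the exact strings):

{code}
public static final String DFS_DN_EC_RECONSTRUCTION_READ_THREADS_KEY =
    "dfs.datanode.ec.reconstruction.stripedread.threads";
public static final String DFS_DN_EC_RECONSTRUCTION_READ_TIMEOUT_MILLIS_KEY =
    "dfs.datanode.ec.reconstruction.stripedread.timeout.millis";
public static final String DFS_DN_EC_RECONSTRUCTION_THREADS_KEY =
    "dfs.datanode.ec.reconstruction.threads";
{code}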

> Improve naming of classes, methods, and variables related to block 
> replication and recovery
> ---
>
> Key: HDFS-7955
> URL: https://issues.apache.org/jira/browse/HDFS-7955
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Rakesh R
> Attachments: HDFS-7955-001.patch
>
>
> Many existing names should be revised to avoid confusion when blocks can be 
> both replicated and erasure coded. This JIRA aims to solicit opinions on 
> making those names more consistent and intuitive.
> # In current HDFS _block recovery_ refers to the process of finalizing the 
> last block of a file, triggered by _lease recovery_. It is different from the 
> intuitive meaning of _recovering a lost block_. To avoid confusion, I can 
> think of 2 options:
> #* Rename this process as _block finalization_ or _block completion_. I 
> prefer this option because this is literally not a recovery.
> #* If we want to keep existing terms unchanged we can name all EC recovery 
> and re-replication logics as _reconstruction_.  
> # As Kai [suggested | 
> https://issues.apache.org/jira/browse/HDFS-7369?focusedCommentId=14361131=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14361131]
>  under HDFS-7369, several replication-based names should be made more generic:
> #* {{UnderReplicatedBlocks}} and {{neededReplications}}. E.g. we can use 
> {{LowRedundancyBlocks}}/{{AtRiskBlocks}}, and 
> {{neededRecovery}}/{{neededReconstruction}}.
> #* {{PendingReplicationBlocks}}
> #* {{ReplicationMonitor}}
> I'm sure the above list is incomplete; discussions and comments are very 
> welcome.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9047) deprecate libwebhdfs in branch-2; remove from trunk

2015-09-17 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804319#comment-14804319
 ] 

Colin Patrick McCabe commented on HDFS-9047:


Like I said, I don't have any objection to replacing libwebhdfs with some code 
that's better and does the same job.  I just don't think we should remove it 
with no replacement.

> deprecate libwebhdfs in branch-2; remove from trunk
> ---
>
> Key: HDFS-9047
> URL: https://issues.apache.org/jira/browse/HDFS-9047
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Allen Wittenauer
>
> This library is basically a mess:
> * It's not part of the mvn package
> * It's missing functionality and barely maintained
> * It's not in the precommit runs so doesn't get exercised regularly
> * It's not part of the unit tests (at least, that I can see)
> * It isn't documented in any official documentation
> But most importantly:  
> * It fails at its primary mission of being pure C (HDFS-3917 is STILL open)
> Let's cut our losses and just remove it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9022) Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client

2015-09-17 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804459#comment-14804459
 ] 

Haohui Mai commented on HDFS-9022:
--

+1

> Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client
> --
>
> Key: HDFS-9022
> URL: https://issues.apache.org/jira/browse/HDFS-9022
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client, namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-9022.000.patch, HDFS-9022.001.patch, 
> HDFS-9022.002.patch, HDFS-9022.003.patch, HDFS-9022.004.patch
>
>
> The static helper methods in NameNodes are used in {{hdfs-client}} module. 
> For example, it's used by the {{DFSClient}} and {{NameNodeProxies}} classes 
> which are being moved to {{hadoop-hdfs-client}} module. Meanwhile, we should 
> keep the {{NameNode}} class itself in the {{hadoop-hdfs}} module.
> This jira tracks the effort of moving the following static helper methods out 
> of  {{NameNode}} and thus {{hadoop-hdfs}} module. A good place to put these 
> methods is the {{DFSUtilClient}} class:
> {code}
> public static InetSocketAddress getAddress(String address);
> public static InetSocketAddress getAddress(Configuration conf);
> public static InetSocketAddress getAddress(URI filesystemURI);
> public static URI getUri(InetSocketAddress namenode);
> {code}
> Be cautious not to bring new checkstyle warnings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9063) Correctly handle snapshot path for getContentSummary

2015-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804482#comment-14804482
 ] 

Hadoop QA commented on HDFS-9063:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m  4s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m  0s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 15s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 24s | The applied patch generated  2 
new checkstyle issues (total was 177, now 179). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 28s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 31s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 10s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 164m  2s | Tests failed in hadoop-hdfs. |
| | | 209m 56s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.hdfs.server.namenode.TestNameNodeResourceChecker |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12756374/HDFS-9063.000.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 58d1a02 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12511/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12511/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12511/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12511/console |


This message was automatically generated.

> Correctly handle snapshot path for getContentSummary
> 
>
> Key: HDFS-9063
> URL: https://issues.apache.org/jira/browse/HDFS-9063
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-9063.000.patch
>
>
> The current getContentSummary implementation does not take into account the 
> snapshot path, thus if we have the following ops:
> 1. create dirs /foo/bar
> 2. take snapshot s1 on /foo
> 3. create a 1 byte file /foo/bar/baz
> then "du /foo" and "du /foo/.snapshot/s1" can report same results for "bar", 
> which is incorrect since the 1 byte file is not included in snapshot s1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8967) Create a BlockManagerLock class to represent the lock used in the BlockManager

2015-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804485#comment-14804485
 ] 

Hadoop QA commented on HDFS-8967:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12753563/HDFS-8967.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 3f82f58 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12516/console |


This message was automatically generated.

> Create a BlockManagerLock class to represent the lock used in the BlockManager
> --
>
> Key: HDFS-8967
> URL: https://issues.apache.org/jira/browse/HDFS-8967
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-8967.000.patch, HDFS-8967.001.patch, 
> HDFS-8967.002.patch
>
>
> This jira proposes to create a {{BlockManagerLock}} class to represent the 
> lock used in {{BlockManager}}.
> Currently it directly points to the {{FSNamesystem}} lock thus there are no 
> functionality changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9086) Rename dfs.datanode.stripedread.threshold.millis to dfs.datanode.stripedread.timeout.millis

2015-09-17 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-9086:
--
Attachment: hdfs-9086-hdfs-7285.001.patch

Patch attached doing this rename

> Rename dfs.datanode.stripedread.threshold.millis to 
> dfs.datanode.stripedread.timeout.millis
> ---
>
> Key: HDFS-9086
> URL: https://issues.apache.org/jira/browse/HDFS-9086
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: HDFS-7285
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Trivial
> Attachments: hdfs-9086-hdfs-7285.001.patch
>
>
> This config key is used to control the timeout for ECWorker reads, let's name 
> it with the standard term "timeout" rather than "threshold".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path

2015-09-17 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-5802:

Status: Open  (was: Patch Available)

> NameNode does not check for inode type before traversing down a path
> 
>
> Key: HDFS-5802
> URL: https://issues.apache.org/jira/browse/HDFS-5802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Xiao Chen
>Priority: Trivial
> Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch, 
> HDFS-5802.003.patch
>
>
> This came up during the discussion on a forum at 
> http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162
>  surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is 
> a file and not a directory.
> In such a case, NameNode yields a user-confusing message of {{Permission 
> denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead 
> of clearly saying (and realising) "/foo is not a directory" or "/foo is a 
> file" before it tries to traverse further down to locate the requested path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9097) Erasure coding: update EC command "-s" flag to "-p" when specifying policy

2015-09-17 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-9097:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: HDFS-7285
   Status: Resolved  (was: Patch Available)

Thanks Andrew for reviewing! The test failures are unrelated. 
{{TestWebHDFSOAuth2}} passes locally (it seems to be a library loading issue). 
{{testWriteStripedFileWithDNFailure}} is flaky in the branch nightly Jenkins 
and we should fix it (as a new subtask).

The findbugs issues are pre-existing as well, being addressed in HDFS-8550.

> Erasure coding: update EC command "-s" flag to "-p" when specifying policy
> --
>
> Key: HDFS-9097
> URL: https://issues.apache.org/jira/browse/HDFS-9097
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Fix For: HDFS-7285
>
> Attachments: HDFS-9097-HDFS-7285.00.patch
>
>
> HDFS-8833 missed this update.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7037) Using distcp to copy data from insecure to secure cluster via hftp doesn't work (branch-2 only)

2015-09-17 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804594#comment-14804594
 ] 

Haohui Mai commented on HDFS-7037:
--

bq. adding this capability to HFTP does not change the security semantics of 
Hadoop at all, since RPC and other interfaces used for remote access already 
support allowing configurable insecure fallback

Please correct me if I misunderstood. (1) The current behavior of RPC / WebHDFS 
is less than ideal and is vulnerable to attack. (2) You argue that the proposed 
change makes HFTP vulnerable through the fallback, but that it is no worse than 
what we have in RPC / WebHDFS today.

As an analogy, it seems to me that the argument is that it's okay to have a 
broken window given that we have many broken windows already?

My question is: is there a need to create yet another workaround, given that we 
know it is prone to security vulnerabilities? I'd like to understand your use 
cases better. Can you please elaborate on why you'll need another workaround in 
HFTP, given that you have already put the workaround in WebHDFS?


> Using distcp to copy data from insecure to secure cluster via hftp doesn't 
> work  (branch-2 only)
> 
>
> Key: HDFS-7037
> URL: https://issues.apache.org/jira/browse/HDFS-7037
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security, tools
>Affects Versions: 2.6.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>  Labels: BB2015-05-TBR
> Attachments: HDFS-7037.001.patch
>
>
> This is a branch-2 only issue since hftp is only supported there. 
> Issuing "distcp hftp:// hdfs://" gave the 
> following failure exception:
> {code}
> 14/09/13 22:07:40 INFO tools.DelegationTokenFetcher: Error when dealing 
> remote token:
> java.io.IOException: Error when dealing remote token: Internal Server Error
>   at 
> org.apache.hadoop.hdfs.tools.DelegationTokenFetcher.run(DelegationTokenFetcher.java:375)
>   at 
> org.apache.hadoop.hdfs.tools.DelegationTokenFetcher.getDTfromRemote(DelegationTokenFetcher.java:238)
>   at 
> org.apache.hadoop.hdfs.web.HftpFileSystem$2.run(HftpFileSystem.java:252)
>   at 
> org.apache.hadoop.hdfs.web.HftpFileSystem$2.run(HftpFileSystem.java:247)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
>   at 
> org.apache.hadoop.hdfs.web.HftpFileSystem.getDelegationToken(HftpFileSystem.java:247)
>   at 
> org.apache.hadoop.hdfs.web.TokenAspect.ensureTokenInitialized(TokenAspect.java:140)
>   at 
> org.apache.hadoop.hdfs.web.HftpFileSystem.addDelegationTokenParam(HftpFileSystem.java:337)
>   at 
> org.apache.hadoop.hdfs.web.HftpFileSystem.openConnection(HftpFileSystem.java:324)
>   at 
> org.apache.hadoop.hdfs.web.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:457)
>   at 
> org.apache.hadoop.hdfs.web.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:472)
>   at 
> org.apache.hadoop.hdfs.web.HftpFileSystem.getFileStatus(HftpFileSystem.java:501)
>   at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:57)
>   at org.apache.hadoop.fs.Globber.glob(Globber.java:248)
>   at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1623)
>   at 
> org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:77)
>   at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:81)
>   at 
> org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:342)
>   at org.apache.hadoop.tools.DistCp.execute(DistCp.java:154)
>   at org.apache.hadoop.tools.DistCp.run(DistCp.java:121)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.tools.DistCp.main(DistCp.java:390)
> 14/09/13 22:07:40 WARN security.UserGroupInformation: 
> PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) 
> cause:java.io.IOException: Unable to obtain remote token
> 14/09/13 22:07:40 ERROR tools.DistCp: Exception encountered 
> java.io.IOException: Unable to obtain remote token
>   at 
> org.apache.hadoop.hdfs.tools.DelegationTokenFetcher.getDTfromRemote(DelegationTokenFetcher.java:249)
>   at 
> org.apache.hadoop.hdfs.web.HftpFileSystem$2.run(HftpFileSystem.java:252)
>   at 
> org.apache.hadoop.hdfs.web.HftpFileSystem$2.run(HftpFileSystem.java:247)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
>   at 
> 

[jira] [Commented] (HDFS-9086) Rename dfs.datanode.stripedread.threshold.millis to dfs.datanode.stripedread.timeout.millis

2015-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804604#comment-14804604
 ] 

Hadoop QA commented on HDFS-9086:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12757162/hdfs-9086-hdfs-7285.001.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / ee4ee6a |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12520/console |


This message was automatically generated.

> Rename dfs.datanode.stripedread.threshold.millis to 
> dfs.datanode.stripedread.timeout.millis
> ---
>
> Key: HDFS-9086
> URL: https://issues.apache.org/jira/browse/HDFS-9086
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: HDFS-7285
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Trivial
> Attachments: hdfs-9086-hdfs-7285.001.patch
>
>
> This config key is used to control the timeout for ECWorker reads, let's name 
> it with the standard term "timeout" rather than "threshold".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path

2015-09-17 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-5802:

Attachment: HDFS-5802.004.patch

Upload patch 004 to fix checkstyle warnings.
The whitespace error is not from my changes, leave it for now.

> NameNode does not check for inode type before traversing down a path
> 
>
> Key: HDFS-5802
> URL: https://issues.apache.org/jira/browse/HDFS-5802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Xiao Chen
>Priority: Trivial
> Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch, 
> HDFS-5802.003.patch, HDFS-5802.004.patch
>
>
> This came up during the discussion on a forum at 
> http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162
>  surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is 
> a file and not a directory.
> In such a case, NameNode yields a user-confusing message of {{Permission 
> denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead 
> of clearly saying (and realising) "/foo is not a directory" or "/foo is a 
> file" before it tries to traverse further down to locate the requested path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7037) Using distcp to copy data from insecure to secure cluster via hftp doesn't work (branch-2 only)

2015-09-17 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804611#comment-14804611
 ] 

Aaron T. Myers commented on HDFS-7037:
--

bq. Please correct me if I misunderstood. (1) The current behavior of RPC / 
WebHDFS is less than ideal and is vulnerable to attack. (2) You argue that the 
proposed change makes HFTP vulnerable through the fallback, but that it is no 
worse than what we have in RPC / WebHDFS today.

Correct.

bq. As an analogy, it seems to me that the argument is that it's okay to have a 
broken window given that we have many broken windows already?

I don't think that's a reasonable analogy. The point you were making is that 
this change introduces a possible security vulnerability. I'm saying that this 
is demonstrably not a security vulnerability, since we consciously chose to add 
this capability to other interfaces. HADOOP-11701 will make things configurably 
more secure for all interfaces, but that's a separate discussion.

bq. My question is: is there a need to create yet another workaround, given 
that we know it is prone to security vulnerabilities? 

Like I said above, this should not be considered a security vulnerability. If 
it is, then we should have never added this capability to WebHDFS/RPC, and we 
should be reverting it from WebHDFS/RPC right now.

bq. I'd like to understand your use cases better? Can you please elaborate why 
you'll need another workaround in HFTP, given that you guys have put the 
workaround in WebHDFS already?

Simple: because some users use HFTP and not WebHDFS, specifically for distcp 
from older clusters.

> Using distcp to copy data from insecure to secure cluster via hftp doesn't 
> work  (branch-2 only)
> 
>
> Key: HDFS-7037
> URL: https://issues.apache.org/jira/browse/HDFS-7037
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security, tools
>Affects Versions: 2.6.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>  Labels: BB2015-05-TBR
> Attachments: HDFS-7037.001.patch
>
>
> This is a branch-2 only issue since hftp is only supported there. 
> Issuing "distcp hftp:// hdfs://" gave the 
> following failure exception:
> {code}
> 14/09/13 22:07:40 INFO tools.DelegationTokenFetcher: Error when dealing 
> remote token:
> java.io.IOException: Error when dealing remote token: Internal Server Error
>   at 
> org.apache.hadoop.hdfs.tools.DelegationTokenFetcher.run(DelegationTokenFetcher.java:375)
>   at 
> org.apache.hadoop.hdfs.tools.DelegationTokenFetcher.getDTfromRemote(DelegationTokenFetcher.java:238)
>   at 
> org.apache.hadoop.hdfs.web.HftpFileSystem$2.run(HftpFileSystem.java:252)
>   at 
> org.apache.hadoop.hdfs.web.HftpFileSystem$2.run(HftpFileSystem.java:247)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
>   at 
> org.apache.hadoop.hdfs.web.HftpFileSystem.getDelegationToken(HftpFileSystem.java:247)
>   at 
> org.apache.hadoop.hdfs.web.TokenAspect.ensureTokenInitialized(TokenAspect.java:140)
>   at 
> org.apache.hadoop.hdfs.web.HftpFileSystem.addDelegationTokenParam(HftpFileSystem.java:337)
>   at 
> org.apache.hadoop.hdfs.web.HftpFileSystem.openConnection(HftpFileSystem.java:324)
>   at 
> org.apache.hadoop.hdfs.web.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:457)
>   at 
> org.apache.hadoop.hdfs.web.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:472)
>   at 
> org.apache.hadoop.hdfs.web.HftpFileSystem.getFileStatus(HftpFileSystem.java:501)
>   at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:57)
>   at org.apache.hadoop.fs.Globber.glob(Globber.java:248)
>   at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1623)
>   at 
> org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:77)
>   at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:81)
>   at 
> org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:342)
>   at org.apache.hadoop.tools.DistCp.execute(DistCp.java:154)
>   at org.apache.hadoop.tools.DistCp.run(DistCp.java:121)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.hadoop.tools.DistCp.main(DistCp.java:390)
> 14/09/13 22:07:40 WARN security.UserGroupInformation: 
> PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) 
> cause:java.io.IOException: Unable to obtain remote token
> 14/09/13 22:07:40 ERROR tools.DistCp: Exception encountered 
> java.io.IOException: Unable to obtain remote token
>   at 
> 

[jira] [Updated] (HDFS-9086) Rename dfs.datanode.stripedread.threshold.millis to dfs.datanode.stripedread.timeout.millis

2015-09-17 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-9086:
--
Attachment: HDFS-9086-HDFS-7285.001.patch

Thanks for reviewing, Zhe. Attaching the same patch with a capitalized name; 
let's see if Jenkins takes it :)

> Rename dfs.datanode.stripedread.threshold.millis to 
> dfs.datanode.stripedread.timeout.millis
> ---
>
> Key: HDFS-9086
> URL: https://issues.apache.org/jira/browse/HDFS-9086
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: HDFS-7285
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Trivial
> Attachments: HDFS-9086-HDFS-7285.001.patch, 
> hdfs-9086-hdfs-7285.001.patch
>
>
> This config key is used to control the timeout for ECWorker reads; let's name 
> it with the standard term "timeout" rather than "threshold".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5802) NameNode does not check for inode type before traversing down a path

2015-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804495#comment-14804495
 ] 

Hadoop QA commented on HDFS-5802:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m 19s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m  3s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 14s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 21s | The applied patch generated  3 
new checkstyle issues (total was 36, now 39). |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 28s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 28s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 16s | Pre-build of native portion |
| {color:green}+1{color} | hdfs tests | 163m 49s | Tests passed in hadoop-hdfs. 
|
| | | 209m 58s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12757115/HDFS-5802.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 58d1a02 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12512/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12512/artifact/patchprocess/whitespace.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12512/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12512/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12512/console |


This message was automatically generated.

> NameNode does not check for inode type before traversing down a path
> 
>
> Key: HDFS-5802
> URL: https://issues.apache.org/jira/browse/HDFS-5802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Xiao Chen
>Priority: Trivial
> Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch, 
> HDFS-5802.003.patch
>
>
> This came up during the discussion on a forum at 
> http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162
>  surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is 
> a file and not a directory.
> In such a case, NameNode yields a confusing message of {{Permission 
> denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead 
> of clearly saying (and realising) "/foo is not a directory" or "/foo is a 
> file" before it tries to traverse further down to locate the requested path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9099) Move DistributedFileSystem to hadoop-hdfs-client

2015-09-17 Thread Mingliang Liu (JIRA)
Mingliang Liu created HDFS-9099:
---

 Summary: Move DistributedFileSystem to hadoop-hdfs-client
 Key: HDFS-9099
 URL: https://issues.apache.org/jira/browse/HDFS-9099
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Mingliang Liu
Assignee: Mingliang Liu


This jira tracks the effort of moving the 
{{org.apache.hadoop.hdfs.DistributedFileSystem}} class from the {{hadoop-hdfs}} 
module to the {{hadoop-hdfs-client}} module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8873) throttle directoryScanner

2015-09-17 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804503#comment-14804503
 ] 

Colin Patrick McCabe commented on HDFS-8873:


The jenkins errors look like:
{code}
java.lang.NoSuchMethodError: 
org.apache.hadoop.hdfs.protocol.DatanodeInfo.<init>(Lorg/apache/hadoop/hdfs/protocol/DatanodeID;Ljava/lang/String;ILorg/apache/hadoop/hdfs/protocol/DatanodeInfo$AdminStates;)V
at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:591)
{code}

We've seen this before and never managed to track it down.  It seems to be a 
bug in our Jenkins integration, possibly related to having multiple maven 
invocations going on at once sharing the same .m2 directory.  I will re-trigger 
the build.

bq. The whitespace error is interesting. I changed line n in the patch. Jenkins 
complained about the whitespace on line n+1. I fixed the whitespace on line n+1 
in the next patch. Jenkins is now complaining about the whitespace on line n+2

I would say just leave it alone.  If you didn't introduce the whitespace issue, 
then don't worry about it.  We really should turn off most of those checkstyle 
checks since they provide no value.

> throttle directoryScanner
> -
>
> Key: HDFS-8873
> URL: https://issues.apache.org/jira/browse/HDFS-8873
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Nathan Roberts
>Assignee: Daniel Templeton
> Attachments: HDFS-8873.001.patch, HDFS-8873.002.patch, 
> HDFS-8873.003.patch
>
>
> The new 2-level directory layout can make directory scans expensive in terms 
> of disk seeks (see HDFS-8791 for details). 
> It would be good if the directoryScanner() had a configurable duty cycle that 
> would reduce its impact on disk performance (much like the approach in 
> HDFS-8617). 
> Without such a throttle, disks can go 100% busy for many minutes at a time. 
> (Assuming the common case of all inodes in cache but no directory blocks 
> cached, 64K seeks are required for a full directory listing, which translates 
> to roughly 655 seconds.)
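
As a reference for the duty-cycle idea, a minimal sketch of a scan loop that runs for a bounded slice of time and then idles so scanning occupies only a configured fraction of wall time; the class, its parameters, and the chunking callback are hypothetical, not the committed patch:
{code}
// A minimal duty-cycle sketch (assumption: dutyCycle is in (0, 1]).
public class ThrottledScanLoop {
  private final long runMs;        // scan for this long per period, e.g. 1000
  private final double dutyCycle;  // e.g. 0.2 keeps the disk at most ~20% busy

  public ThrottledScanLoop(long runMs, double dutyCycle) {
    this.runMs = runMs;
    this.dutyCycle = dutyCycle;
  }

  /** Runs until the thread is interrupted. */
  public void run(Runnable scanChunk) throws InterruptedException {
    while (true) {
      long start = System.currentTimeMillis();
      while (System.currentTimeMillis() - start < runMs) {
        scanChunk.run(); // scan a bounded chunk of directories
      }
      long busy = System.currentTimeMillis() - start;
      // Idle long enough that busy / (busy + idle) ~= dutyCycle.
      long idle = (long) (busy * (1.0 - dutyCycle) / dutyCycle);
      Thread.sleep(idle);
    }
  }
}
{code}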



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9080) update htrace version to 4.0

2015-09-17 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9080:
---
Attachment: HDFS-9080.004.patch

> update htrace version to 4.0
> 
>
> Key: HDFS-9080
> URL: https://issues.apache.org/jira/browse/HDFS-9080
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-9080.001.patch, HDFS-9080.002.patch, 
> HDFS-9080.003.patch, HDFS-9080.004.patch
>
>
> Update the HTrace library version Hadoop uses to htrace 4.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8696) Reduce the variances of latency of WebHDFS

2015-09-17 Thread Bob Hansen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804558#comment-14804558
 ] 

Bob Hansen commented on HDFS-8696:
--

Testing the latest patch as part of a full Hadoop build (rather than just a set 
of patched jars over an older Hadoop build) shows much less variance.  After a 
warm-up period, we had more than 500k short requests complete in under 1000 ms 
and none at or above 1000 ms.

Let's call this a tentative success while we continue testing.

I've reviewed the code.  We can probably drop the default nio thread count down 
from 100 threads to at most the number of CPUs.  Other than that, +1.
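
For what it's worth, a tiny sketch of that sizing, assuming a Netty 4 {{NioEventLoopGroup}} such as the one behind the datanode web server; the cap and the surrounding wiring are illustrative, not the actual patch:
{code}
import io.netty.channel.nio.NioEventLoopGroup;

// Hedged sketch: cap the nio worker threads at the CPU count instead of a
// fixed 100. The constant and server wiring are illustrative.
public class WorkerSizing {
  public static void main(String[] args) {
    int cpus = Runtime.getRuntime().availableProcessors();
    int nioThreads = Math.min(100, cpus);
    NioEventLoopGroup workers = new NioEventLoopGroup(nioThreads);
    System.out.println("nio threads: " + nioThreads);
    workers.shutdownGracefully();
  }
}
{code}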

> Reduce the variances of latency of WebHDFS
> --
>
> Key: HDFS-8696
> URL: https://issues.apache.org/jira/browse/HDFS-8696
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 2.7.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-8696.1.patch, HDFS-8696.2.patch, HDFS-8696.3.patch
>
>
> There is an issue that appears related to the webhdfs server. When making two 
> concurrent requests, the DN will sometimes pause for extended periods (I've 
> seen 1-300 seconds), killing performance and dropping connections. 
> To reproduce: 
> 1. set up a HDFS cluster
> 2. Upload a large file (I was using 10GB). Perform 1-byte reads, writing
> the time out to /tmp/times.txt
> {noformat}
> i=1
> while (true); do 
> echo $i
> let i++
> /usr/bin/time -f %e -o /tmp/times.txt -a curl -s -L -o /dev/null 
> "http://:50070/webhdfs/v1/tmp/bigfile?op=OPEN=root=1";
> done
> {noformat}
> 3. Watch for 1-byte requests that take more than one second:
> tail -F /tmp/times.txt | grep -E "^[^0]"
> 4. After it has had a chance to warm up, start doing large transfers from
> another shell:
> {noformat}
> i=1
> while (true); do 
> echo $i
> let i++
> /usr/bin/time -f %e curl -s -L -o /dev/null 
> "http://:50070/webhdfs/v1/tmp/bigfile?op=OPEN=root";
> done
> {noformat}
> It's easy to find after a minute or two that small reads will sometimes
> pause for 1-300 seconds. In some extreme cases, it appears that the
> transfers timeout and the DN drops the connection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8550) Erasure Coding: Fix FindBugs Multithreaded correctness Warning

2015-09-17 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804556#comment-14804556
 ] 

Zhe Zhang commented on HDFS-8550:
-

Thanks Rakesh! The fixes look good except for the following questions / 
suggestions:
{code}
-  if (b.isStriped()) {
+  if (b.isStriped() && b instanceof LocatedStripedBlock) {
{code}

This would be better as:
{code}
if (b.isStriped()) {
  Preconditions.checkState(b instanceof LocatedStripedBlock);
}
{code}

{{int bufOffset = (int) (rangeStartInBlockGroup % ((long) cellSize * 
dataBlkNum));}}: should it be {{(long)(cellSize * dataBlkNum)}}?

For {{synchronized (DFSStripedInputStream.this)}}, maybe {{synchronized 
(curStripeBuf)}} would be more explicit?
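
For context on the cast question above, a small sketch of how the two placements behave differently; the values are artificial, chosen to overflow int, and not taken from the patch:
{code}
// Hedged sketch: casting after the multiply lets the int arithmetic overflow
// first, while casting one operand widens the multiply to long.
public class CastDemo {
  public static void main(String[] args) {
    int cellSize = 1 << 30;  // artificial value
    int dataBlkNum = 4;      // artificial value

    long castAfter = (long) (cellSize * dataBlkNum); // int multiply overflows first: 0
    long castBefore = (long) cellSize * dataBlkNum;  // widened before multiply: 4294967296

    System.out.println(castAfter);
    System.out.println(castBefore);
  }
}
{code}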

> Erasure Coding: Fix FindBugs Multithreaded correctness Warning
> --
>
> Key: HDFS-8550
> URL: https://issues.apache.org/jira/browse/HDFS-8550
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-8550-HDFS-7285-00.patch, 
> HDFS-8550-HDFS-7285-01.patch
>
>
> Please find the findbug warnings 
> [report|https://builds.apache.org/job/PreCommit-HDFS-Build/12444/artifact/patchprocess/patchFindbugsWarningshadoop-hdfs.html]
> 1) {code}
> Bug type IS2_INCONSISTENT_SYNC (click for details) 
> In class org.apache.hadoop.hdfs.DFSStripedInputStream
> Field org.apache.hadoop.hdfs.DFSStripedInputStream.curStripeBuf
> Synchronized 90% of the time
> Unsynchronized access at DFSStripedInputStream.java:[line 829]
> Synchronized access at DFSStripedInputStream.java:[line 183]
> Synchronized access at DFSStripedInputStream.java:[line 186]
> Synchronized access at DFSStripedInputStream.java:[line 184]
> Synchronized access at DFSStripedInputStream.java:[line 382]
> Synchronized access at DFSStripedInputStream.java:[line 460]
> Synchronized access at DFSStripedInputStream.java:[line 461]
> Synchronized access at DFSStripedInputStream.java:[line 461]
> Synchronized access at DFSStripedInputStream.java:[line 285]
> Synchronized access at DFSStripedInputStream.java:[line 297]
> Synchronized access at DFSStripedInputStream.java:[line 298]
> {code}
> 2) 
> {code}
> Unread field: 
> org.apache.hadoop.hdfs.DFSStripedInputStream$BlockReaderInfo.targetBlock
> Bug type URF_UNREAD_FIELD (click for details) 
> In class org.apache.hadoop.hdfs.DFSStripedInputStream$BlockReaderInfo
> Field org.apache.hadoop.hdfs.DFSStripedInputStream$BlockReaderInfo.targetBlock
> At DFSStripedInputStream.java:[line 126]
> {code}
> 3) 
> {code}
> Unchecked/unconfirmed cast from org.apache.hadoop.hdfs.protocol.LocatedBlock 
> to org.apache.hadoop.hdfs.protocol.LocatedStripedBlock in 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.setBlockToken(LocatedBlock,
>  BlockTokenIdentifier$AccessMode)
> Bug type BC_UNCONFIRMED_CAST (click for details) 
> In class org.apache.hadoop.hdfs.server.blockmanagement.BlockManager
> In method 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.setBlockToken(LocatedBlock,
>  BlockTokenIdentifier$AccessMode)
> Actual type org.apache.hadoop.hdfs.protocol.LocatedBlock
> Expected org.apache.hadoop.hdfs.protocol.LocatedStripedBlock
> Value loaded from b
> At BlockManager.java:[line 974]
> {code}
> 4) 
> {code}
> Result of integer multiplication cast to long in 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.divideOneStripe(ErasureCodingPolicy,
>  int, LocatedStripedBlock, long, long, ByteBuffer)
> Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) 
> In class org.apache.hadoop.hdfs.util.StripedBlockUtil
> In method 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.divideOneStripe(ErasureCodingPolicy,
>  int, LocatedStripedBlock, long, long, ByteBuffer)
> At StripedBlockUtil.java:[line 375]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9004) Add upgrade domain to DatanodeInfo

2015-09-17 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804634#comment-14804634
 ] 

Lei (Eddy) Xu commented on HDFS-9004:
-

+1 pending jenkins.

> Add upgrade domain to DatanodeInfo
> --
>
> Key: HDFS-9004
> URL: https://issues.apache.org/jira/browse/HDFS-9004
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-9004-2.patch, HDFS-9004-3.patch, HDFS-9004.patch
>
>
> As part of the upgrade domain feature, we first need to add an upgrade domain 
> string to {{DatanodeInfo}}. This includes:
> * Add a new field to DatanodeInfo.
> * Modify protobuf for DatanodeInfo.
> * Update DatanodeInfo.getDatanodeReport to include upgrade domain.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9022) Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client

2015-09-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804651#comment-14804651
 ] 

Hudson commented on HDFS-9022:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2350 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2350/])
HDFS-9022. Move NameNode.getAddress() and NameNode.getUri() to 
hadoop-hdfs-client. Contributed by Mingliang Liu. (wheat9: rev 
9eee97508f350ed4629abb04e7781514ffa04070)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeFile.java
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestWrites.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeRollingUpgrade.java
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/mount/RpcProgramMountd.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPersistBlocks.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSZKFailoverController.java
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/DFSClientCache.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestGetBlocks.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/BootstrapStandby.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestFailureToReadEdits.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileStatus.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSShellGenericOptions.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/NameNodeProxies.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDefaultNameNodePort.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/GetGroups.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ImageServlet.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSUtilClient.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/DfsServlet.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/IPFailoverProxyProvider.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientFailover.java


> Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client
> --
>
> Key: HDFS-9022
> URL: https://issues.apache.org/jira/browse/HDFS-9022
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9022.000.patch, HDFS-9022.001.patch, 
> HDFS-9022.002.patch, HDFS-9022.003.patch, HDFS-9022.004.patch
>
>
> The static helper methods in {{NameNode}} are used in the {{hdfs-client}} 
> module. For example, they are used by the {{DFSClient}} and 
> {{NameNodeProxies}} classes, which are being moved to the 
> {{hadoop-hdfs-client}} module. Meanwhile, we should keep the {{NameNode}} 
> class itself in the {{hadoop-hdfs}} module.
> This jira tracks the effort of moving the following static helper methods out 
> of {{NameNode}}, and thus out of the {{hadoop-hdfs}} module. A good place to 
> put these methods is the {{DFSUtilClient}} class:
> {code}
> public static InetSocketAddress getAddress(String address);
> public static InetSocketAddress getAddress(Configuration conf);
> public static InetSocketAddress getAddress(URI filesystemURI);
> public static URI getUri(InetSocketAddress namenode);
> {code}
> Be careful not to introduce new checkstyle warnings.
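
A minimal usage sketch after the move, assuming the helpers keep the signatures listed above when they land in {{DFSUtilClient}}; the method names below are that assumption, not the committed API:
{code}
import java.net.InetSocketAddress;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSUtilClient;

// Hedged sketch: callers resolve the NN address via DFSUtilClient rather
// than NameNode. getAddress/getUri are the names the jira text proposes.
public class ResolveNameNode {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    InetSocketAddress nnAddr = DFSUtilClient.getAddress(conf); // assumed name
    URI nnUri = DFSUtilClient.getUri(nnAddr);                  // assumed name
    System.out.println(nnUri);
  }
}
{code}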



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9022) Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client

2015-09-17 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-9022:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

I've committed the patch to trunk and branch-2. Thanks [~liuml07] for the 
contribution.

> Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client
> --
>
> Key: HDFS-9022
> URL: https://issues.apache.org/jira/browse/HDFS-9022
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9022.000.patch, HDFS-9022.001.patch, 
> HDFS-9022.002.patch, HDFS-9022.003.patch, HDFS-9022.004.patch
>
>
> The static helper methods in {{NameNode}} are used in the {{hdfs-client}} 
> module. For example, they are used by the {{DFSClient}} and 
> {{NameNodeProxies}} classes, which are being moved to the 
> {{hadoop-hdfs-client}} module. Meanwhile, we should keep the {{NameNode}} 
> class itself in the {{hadoop-hdfs}} module.
> This jira tracks the effort of moving the following static helper methods out 
> of {{NameNode}}, and thus out of the {{hadoop-hdfs}} module. A good place to 
> put these methods is the {{DFSUtilClient}} class:
> {code}
> public static InetSocketAddress getAddress(String address);
> public static InetSocketAddress getAddress(Configuration conf);
> public static InetSocketAddress getAddress(URI filesystemURI);
> public static URI getUri(InetSocketAddress namenode);
> {code}
> Be careful not to introduce new checkstyle warnings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9022) Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client

2015-09-17 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-9022:
-
Component/s: (was: namenode)

> Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client
> --
>
> Key: HDFS-9022
> URL: https://issues.apache.org/jira/browse/HDFS-9022
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9022.000.patch, HDFS-9022.001.patch, 
> HDFS-9022.002.patch, HDFS-9022.003.patch, HDFS-9022.004.patch
>
>
> The static helper methods in {{NameNode}} are used in the {{hdfs-client}} 
> module. For example, they are used by the {{DFSClient}} and 
> {{NameNodeProxies}} classes, which are being moved to the 
> {{hadoop-hdfs-client}} module. Meanwhile, we should keep the {{NameNode}} 
> class itself in the {{hadoop-hdfs}} module.
> This jira tracks the effort of moving the following static helper methods out 
> of {{NameNode}}, and thus out of the {{hadoop-hdfs}} module. A good place to 
> put these methods is the {{DFSUtilClient}} class:
> {code}
> public static InetSocketAddress getAddress(String address);
> public static InetSocketAddress getAddress(Configuration conf);
> public static InetSocketAddress getAddress(URI filesystemURI);
> public static URI getUri(InetSocketAddress namenode);
> {code}
> Be careful not to introduce new checkstyle warnings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9086) Rename dfs.datanode.stripedread.threshold.millis to dfs.datanode.stripedread.timeout.millis

2015-09-17 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-9086:
--
Status: Patch Available  (was: Open)

> Rename dfs.datanode.stripedread.threshold.millis to 
> dfs.datanode.stripedread.timeout.millis
> ---
>
> Key: HDFS-9086
> URL: https://issues.apache.org/jira/browse/HDFS-9086
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: HDFS-7285
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Trivial
> Attachments: hdfs-9086-hdfs-7285.001.patch
>
>
> This config key is used to control the timeout for ECWorker reads; let's name 
> it with the standard term "timeout" rather than "threshold".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9097) Erasure coding: update EC command "-s" flag to "-p" when specifying policy

2015-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804539#comment-14804539
 ] 

Hadoop QA commented on HDFS-9097:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  18m 53s | Findbugs (version ) appears to 
be broken on HDFS-7285. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 52s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 43s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 14s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | site |   2m 57s | Site still builds. |
| {color:green}+1{color} | checkstyle |   0m 31s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 31s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m 30s | The patch appears to introduce 4 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  5s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 186m 30s | Tests failed in hadoop-hdfs. |
| | | 234m 25s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.web.TestWebHDFSOAuth2 |
|   | hadoop.hdfs.TestWriteStripedFileWithFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12757113/HDFS-9097-HDFS-7285.00.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | HDFS-7285 / e36129b |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12513/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12513/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12513/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12513/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12513/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12513/console |


This message was automatically generated.

> Erasure coding: update EC command "-s" flag to "-p" when specifying policy
> --
>
> Key: HDFS-9097
> URL: https://issues.apache.org/jira/browse/HDFS-9097
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-9097-HDFS-7285.00.patch
>
>
> HDFS-8833 missed this update.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections

2015-09-17 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804547#comment-14804547
 ] 

Owen O'Malley commented on HDFS-8855:
-

A few points:
* You need to use Token.getKind(), Token.getIdentifier(), and 
Token.getPassword() as the key for the cache; a sketch follows below. The patch 
currently uses Token.toString, which uses the identifier, kind, and service. 
The service is set by the client, so it shouldn't be part of the match. The 
password, on the other hand, must be part of the match so that guessing the 
identifier doesn't allow a hacker to impersonate the user.
* The timeout should default to 10 minutes instead of 10 seconds.
* Please fix the checkstyle and findbugs warnings.
* Determine what is wrong with the test case.

Other than that, it looks good.
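
A minimal sketch of such a cache key, built from kind, identifier, and password while deliberately excluding the client-set service field; the class name and the surrounding cache are hypothetical:
{code}
import java.util.Arrays;
import java.util.Objects;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.token.Token;

// Hedged sketch: equality covers kind + identifier + password only; the
// service field is intentionally left out, per the comment above.
final class TokenCacheKey {
  private final Text kind;
  private final byte[] identifier;
  private final byte[] password;

  TokenCacheKey(Token<?> token) {
    this.kind = token.getKind();
    this.identifier = token.getIdentifier();
    this.password = token.getPassword();
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof TokenCacheKey)) {
      return false;
    }
    TokenCacheKey other = (TokenCacheKey) o;
    return Objects.equals(kind, other.kind)
        && Arrays.equals(identifier, other.identifier)
        && Arrays.equals(password, other.password);
  }

  @Override
  public int hashCode() {
    return Objects.hash(kind, Arrays.hashCode(identifier), Arrays.hashCode(password));
  }
}
{code}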

> Webhdfs client leaks active NameNode connections
> 
>
> Key: HDFS-8855
> URL: https://issues.apache.org/jira/browse/HDFS-8855
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Bob Hansen
>Assignee: Xiaobing Zhou
> Attachments: HDFS-8855.1.patch, HDFS-8855.2.patch, HDFS-8855.3.patch, 
> HDFS-8855.4.patch, HDFS_8855.prototype.patch
>
>
> The attached script simulates a process opening ~50 files via webhdfs and 
> performing random reads.  Note that there are at most 50 concurrent reads, 
> and all webhdfs sessions are kept open.  Each read is ~64k at a random 
> position.  
> The script periodically (once per second) shells into the NameNode and 
> produces a summary of the socket states.  For my test cluster with 5 nodes, 
> it took ~30 seconds for the NameNode to reach ~25000 active connections and 
> fail.
> It appears that each request to the webhdfs client is opening a new 
> connection to the NameNode and keeping it open after the request is complete. 
>  If the process continues to run, eventually (~30-60 seconds), all of the 
> open connections are closed and the NameNode recovers.  
> This smells like SoftReference reaping.  Are we using SoftReferences in the 
> webhdfs client to cache NameNode connections but never re-using them?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9022) Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client

2015-09-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804564#comment-14804564
 ] 

Hudson commented on HDFS-9022:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8472 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8472/])
HDFS-9022. Move NameNode.getAddress() and NameNode.getUri() to 
hadoop-hdfs-client. Contributed by Mingliang Liu. (wheat9: rev 
9eee97508f350ed4629abb04e7781514ffa04070)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDefaultNameNodePort.java
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/mount/RpcProgramMountd.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSZKFailoverController.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSShellGenericOptions.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/DfsServlet.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/GetGroups.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileStatus.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ImageServlet.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientFailover.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeFile.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPersistBlocks.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/BootstrapStandby.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/NameNodeProxies.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSUtilClient.java
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/DFSClientCache.java
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestWrites.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestGetBlocks.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestFailureToReadEdits.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeRollingUpgrade.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/IPFailoverProxyProvider.java


> Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client
> --
>
> Key: HDFS-9022
> URL: https://issues.apache.org/jira/browse/HDFS-9022
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9022.000.patch, HDFS-9022.001.patch, 
> HDFS-9022.002.patch, HDFS-9022.003.patch, HDFS-9022.004.patch
>
>
> The static helper methods in {{NameNode}} are used in the {{hdfs-client}} 
> module. For example, they are used by the {{DFSClient}} and 
> {{NameNodeProxies}} classes, which are being moved to the 
> {{hadoop-hdfs-client}} module. Meanwhile, we should keep the {{NameNode}} 
> class itself in the {{hadoop-hdfs}} module.
> This jira tracks the effort of moving the following static helper methods out 
> of {{NameNode}}, and thus out of the {{hadoop-hdfs}} module. A good place to 
> put these methods is the {{DFSUtilClient}} class:
> {code}
> public static InetSocketAddress getAddress(String address);
> public static InetSocketAddress getAddress(Configuration conf);
> public static InetSocketAddress getAddress(URI filesystemURI);
> public static URI getUri(InetSocketAddress namenode);
> {code}
> Be careful not to introduce new checkstyle warnings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9086) Rename dfs.datanode.stripedread.threshold.millis to dfs.datanode.stripedread.timeout.millis

2015-09-17 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804608#comment-14804608
 ] 

Zhe Zhang commented on HDFS-9086:
-

Thanks Andrew, the patch LGTM. I think Jenkins tried applying it on trunk, 
maybe because of the capitalization in the patch name.

> Rename dfs.datanode.stripedread.threshold.millis to 
> dfs.datanode.stripedread.timeout.millis
> ---
>
> Key: HDFS-9086
> URL: https://issues.apache.org/jira/browse/HDFS-9086
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: HDFS-7285
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Trivial
> Attachments: hdfs-9086-hdfs-7285.001.patch
>
>
> This config key is used to control the timeout for ECWorker reads; let's name 
> it with the standard term "timeout" rather than "threshold".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9004) Add upgrade domain to DatanodeInfo

2015-09-17 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-9004:
--
Attachment: HDFS-9004-3.patch

Thanks [~eddyxu]! Here is the updated patch that addresses your comment. The 
failed unit tests aren't related.

> Add upgrade domain to DatanodeInfo
> --
>
> Key: HDFS-9004
> URL: https://issues.apache.org/jira/browse/HDFS-9004
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-9004-2.patch, HDFS-9004-3.patch, HDFS-9004.patch
>
>
> As part of the upgrade domain feature, we first need to add an upgrade domain 
> string to {{DatanodeInfo}}. This includes:
> * Add a new field to DatanodeInfo.
> * Modify protobuf for DatanodeInfo.
> * Update DatanodeInfo.getDatanodeReport to include upgrade domain.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path

2015-09-17 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-5802:

Status: Patch Available  (was: Open)

> NameNode does not check for inode type before traversing down a path
> 
>
> Key: HDFS-5802
> URL: https://issues.apache.org/jira/browse/HDFS-5802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Xiao Chen
>Priority: Trivial
> Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch, 
> HDFS-5802.003.patch, HDFS-5802.004.patch
>
>
> This came up during the discussion on a forum at 
> http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162
>  surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is 
> a file and not a directory.
> In such a case, NameNode yields a confusing message of {{Permission 
> denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead 
> of clearly saying (and realising) "/foo is not a directory" or "/foo is a 
> file" before it tries to traverse further down to locate the requested path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9092) Nfs silently drops overlapping write requests, thus data copying can't complete

2015-09-17 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804667#comment-14804667
 ] 

Brandon Li commented on HDFS-9092:
--

Thank you, [~yzhangal], for the patch. Could you roughly describe the idea of 
the fix?

> Nfs silently drops overlapping write requests, thus data copying can't 
> complete
> ---
>
> Key: HDFS-9092
> URL: https://issues.apache.org/jira/browse/HDFS-9092
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.7.1
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-9092.001.patch
>
>
> When NOT using the 'sync' option, NFS writes may issue the following warning:
> org.apache.hadoop.hdfs.nfs.nfs3.OpenFileCtx: Got an overlapping write 
> (1248751616, 1249677312), nextOffset=1248752400. Silently drop it now
> and the size of the data copied via NFS will stay at 1248752400.
> What happened is:
> 1. The write requests from the client are sent asynchronously. 
> 2. The NFS gateway has a handler that handles each incoming request by 
> creating an internal write request structure and putting it into a cache;
> 3. In parallel, a separate thread in the NFS gateway takes requests out of 
> the cache and writes the data to HDFS.
> The current offset is how much data has been written by the write thread in 
> step 3. The detection of overlapping write requests happens in step 2, but it 
> only checks the write request against the current offset, and trims the 
> request if necessary. Because the write requests are sent asynchronously, if 
> two requests are both beyond the current offset and they overlap, the overlap 
> is not detected and both requests are put into the cache. This causes the 
> symptom reported in this case at step 3.
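
For reference, a minimal sketch of overlap detection that checks a new request against both the committed offset and other pending requests in the cache; the names are illustrative, not the actual OpenFileCtx internals:
{code}
import java.util.Map;
import java.util.TreeMap;

// Hedged sketch: track pending-but-unwritten requests in an offset-ordered
// map so a new request can be checked against its neighbors, not just
// against the committed offset.
class PendingWrites {
  private final TreeMap<Long, Integer> pending = new TreeMap<>(); // offset -> length
  private long nextOffset; // bytes already flushed to HDFS by the write thread

  synchronized boolean overlaps(long offset, int length) {
    if (offset < nextOffset) {
      return true; // overlaps data already written
    }
    Map.Entry<Long, Integer> below = pending.floorEntry(offset);
    if (below != null && below.getKey() + below.getValue() > offset) {
      return true; // a pending request starting earlier runs into this one
    }
    Long above = pending.ceilingKey(offset);
    return above != null && above < offset + length; // runs into a later pending request
  }

  synchronized void add(long offset, int length) {
    pending.put(offset, length);
  }
}
{code}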



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9040) Erasure coding: Refactor DFSStripedOutputStream (Move Namenode RPC Requests to Coordinator)

2015-09-17 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803339#comment-14803339
 ] 

Jing Zhao commented on HDFS-9040:
-

I think the key point here is to bump the GS, which is necessary to identify 
stale/corrupted internal blocks. For example, when writing the last stripe, 
suppose the last data block fails. Based only on internal block lengths, we 
cannot identify the failure. Later, when we support hflush/hsync, we will have 
to use (GS + block group size) to identify the correct parity blocks.

But maybe the NN does not need a strong correctness guarantee for the expected 
replica list. Block location information can eventually be corrected based on 
full/incremental block reports. 
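
To illustrate why the GS bump matters, a tiny sketch with two replicas of equal length where only the generation stamp reveals the stale one; the names and values are artificial:
{code}
// Hedged sketch: length alone cannot distinguish a stale replica from a
// fresh one, while a bumped generation stamp can.
public class GenStampCheck {
  public static void main(String[] args) {
    long expectedGS = 1002;   // GS after the bump that followed the failure
    long reportedGS = 1001;   // GS of the replica on the failed datanode
    long expectedLen = 65536; // same length on both replicas
    long reportedLen = 65536;

    boolean staleByLength = reportedLen < expectedLen; // false: lengths match
    boolean staleByGS = reportedGS < expectedGS;       // true: GS is behind

    System.out.println("stale by length? " + staleByLength);
    System.out.println("stale by GS? " + staleByGS);
  }
}
{code}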

> Erasure coding: Refactor DFSStripedOutputStream (Move Namenode RPC Requests 
> to Coordinator)
> ---
>
> Key: HDFS-9040
> URL: https://issues.apache.org/jira/browse/HDFS-9040
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Walter Su
> Attachments: HDFS-9040-HDFS-7285.002.patch, 
> HDFS-9040-HDFS-7285.003.patch, HDFS-9040.00.patch, HDFS-9040.001.wip.patch, 
> HDFS-9040.02.bgstreamer.patch
>
>
> The general idea is to simplify error handling logic.
> Proposal 1:
> A BlockGroupDataStreamer to communicate with the NN to allocate/update blocks, 
> while the StripedDataStreamers only have to stream blocks to DNs.
> Proposal 2:
> See below the 
> [comment|https://issues.apache.org/jira/browse/HDFS-9040?focusedCommentId=14741388=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14741388]
>  from [~jingzhao].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

