[jira] [Commented] (HDFS-9063) Correctly handle snapshot path for getContentSummary
[ https://issues.apache.org/jira/browse/HDFS-9063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791678#comment-14791678 ] Yi Liu commented on HDFS-9063: -- Thanks [~jingzhao] for working on this. The Jenkins report has an issue; {{TestGetContentSummaryWithSnapshot}} passes locally. I also found a similar issue (not the same one, but also a problem with getContentSummary when a snapshot exists) while writing tests for large directories in HDFS-9053; it also exists in current trunk. I applied your patch, and the issue I saw is still there. I think you could fix that issue too and write a test with the following steps to reproduce it (of course, if you don't want to fix it, I can do it separately :)): # Suppose we have a directory named 'dir'; create 16 files in the dir # Remove the last file -- now 15 files total in dir # Create a snapshot 's1' of dir # Add 1 file to dir -- now 16 files total in dir # Remove the first file in dir -- now 15 files total in dir # Call getContentSummary(dir), then {{getFileCount}} -- the expected result is 15, but 16 is returned. > Correctly handle snapshot path for getContentSummary > > > Key: HDFS-9063 > URL: https://issues.apache.org/jira/browse/HDFS-9063 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Jing Zhao >Assignee: Jing Zhao > Attachments: HDFS-9063.000.patch > > > The current getContentSummary implementation does not take into account the > snapshot path, thus if we have the following ops: > 1. create dirs /foo/bar > 2. take snapshot s1 on /foo > 3. create a 1 byte file /foo/bar/baz > then "du /foo" and "du /foo/.snapshot/s1" can report the same results for "bar", > which is incorrect since the 1 byte file is not included in snapshot s1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9089) Balancer and Mover should use ".system" as reserved inode name instead of "system"
[ https://issues.apache.org/jira/browse/HDFS-9089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-9089: -- Hadoop Flags: Incompatible change This looks like an incompatible change. > Balancer and Mover should use ".system" as reserved inode name instead of > "system" > -- > > Key: HDFS-9089 > URL: https://issues.apache.org/jira/browse/HDFS-9089 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Reporter: Archana T >Assignee: Surendra Singh Lilhore > Attachments: HDFS-9089.01.patch, HDFS-9089.02.patch > > > Currently Balancer and Mover create "/system" for placing mover.id and > balancer.id: > hdfs dfs -ls / > drwxr-xr-x - root hadoop 0 2015-09-16 12:49 > {color:red}/system{color} > This folder is not deleted once the mover or balancer work is completed, > so users cannot create a dir "system". > It's better to make ".system" a reserved inode for balancer and mover instead > of "system". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9040) Erasure coding: Refactor DFSStripedOutputStream (Move Namenode RPC Requests to Coordinator)
[ https://issues.apache.org/jira/browse/HDFS-9040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791655#comment-14791655 ] Zhe Zhang commented on HDFS-9040: - Thanks Jing for the new patch! The structure looks much cleaner now. I've been thinking about the design of checking streamer failures at {{writeChunk}} and other events at the {{OutputStream}} level. The code structure is certainly simpler than handling failures at the streamer level. But are there any disadvantages to delaying the handling of a streamer failure? If there isn't any downside, should we just do {{updatePipeline}} when completing the block? A few possible disadvantages I can think of: # In the read-being-written scenario, there will be a longer window of *false-fresh* (meaning a stale internal block is considered fresh). # When {{NUM_PARITY_BLOCKS}} streamers are dead, the {{OutputStream}} should die immediately instead of waiting for the next {{writeChunk}}. # We might want to add logic to replace a failed {{StripedDataStreamer}} in the future. Delayed error handling will cause delayed streamer replacement. > Erasure coding: Refactor DFSStripedOutputStream (Move Namenode RPC Requests > to Coordinator) > --- > > Key: HDFS-9040 > URL: https://issues.apache.org/jira/browse/HDFS-9040 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Walter Su > Attachments: HDFS-9040-HDFS-7285.002.patch, > HDFS-9040-HDFS-7285.003.patch, HDFS-9040.00.patch, HDFS-9040.001.wip.patch, > HDFS-9040.02.bgstreamer.patch > > > The general idea is to simplify error handling logic. > Proposal 1: > A BlockGroupDataStreamer to communicate with the NN to allocate/update blocks, so that > {{StripedDataStreamer}}s only have to stream blocks to DNs. > Proposal 2: > See the > [comment|https://issues.apache.org/jira/browse/HDFS-9040?focusedCommentId=14741388&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14741388] > from [~jingzhao] below. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9022) Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791677#comment-14791677 ] Mingliang Liu commented on HDFS-9022: - The only failing test, {{TestReplaceDatanodeOnFailure}}, fails occasionally due to a known bug; see [HDFS-6101]. The timed-out tests cannot be reproduced on my local Mac. New javac warning in TestMRCredentials.java: {quote}getUri(InetSocketAddress) in NameNode has been deprecated{quote} This is expected, as we will file a new jira to replace {{NameNode.getUri()}} with {{DFSUtilClient.getNNUri()}}. See the [comments above | https://issues.apache.org/jira/browse/HDFS-9022?focusedCommentId=14791104&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14791104] > Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client > -- > > Key: HDFS-9022 > URL: https://issues.apache.org/jira/browse/HDFS-9022 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client, namenode >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9022.000.patch, HDFS-9022.001.patch, > HDFS-9022.002.patch, HDFS-9022.003.patch, HDFS-9022.004.patch > > > The static helper methods in NameNode are used in the {{hdfs-client}} module. > For example, they are used by the {{DFSClient}} and {{NameNodeProxies}} classes, > which are being moved to the {{hadoop-hdfs-client}} module. Meanwhile, we should > keep the {{NameNode}} class itself in the {{hadoop-hdfs}} module. > This jira tracks the effort of moving the following static helper methods out > of {{NameNode}}, and thus the {{hadoop-hdfs}} module. A good place to put these > methods is the {{DFSUtilClient}} class: > {code} > public static InetSocketAddress getAddress(String address); > public static InetSocketAddress getAddress(Configuration conf); > public static InetSocketAddress getAddress(URI filesystemURI); > public static URI getUri(InetSocketAddress namenode); > {code} > Be cautious not to introduce new checkstyle warnings. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
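The migration pattern discussed in HDFS-9022 (move the helper to {{DFSUtilClient}}, leave a deprecated delegating shim on {{NameNode}}) can be sketched in plain Java. The class bodies below are simplified stand-ins, not the actual Hadoop implementations; the "hdfs" URI scheme and the hostname-based URI construction are assumptions for illustration.

```java
import java.net.InetSocketAddress;
import java.net.URI;

// Sketch of the migration: the helper logic lives in a client-side utility
// class, and the old NameNode method survives as a deprecated shim that
// delegates to it (calling the shim is what produces the javac deprecation
// warning mentioned for TestMRCredentials.java).
class DFSUtilClientSketch {
    // Stand-in for DFSUtilClient.getNNUri(InetSocketAddress).
    static URI getNNUri(InetSocketAddress namenode) {
        return URI.create("hdfs://" + namenode.getHostString() + ":" + namenode.getPort());
    }
}

class NameNodeSketch {
    /** @deprecated use {@link DFSUtilClientSketch#getNNUri} instead. */
    @Deprecated
    static URI getUri(InetSocketAddress namenode) {
        return DFSUtilClientSketch.getNNUri(namenode); // pure delegation, no logic duplicated
    }
}
```

Keeping the shim (rather than deleting the method outright) is what makes the move compatible for callers that still reference {{NameNode.getUri()}}.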
[jira] [Updated] (HDFS-8341) (Summary & Description may be invalid) HDFS mover stuck in loop after failing to move block, doesn't move rest of blocks, can't get data back off decommissioning external
[ https://issues.apache.org/jira/browse/HDFS-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Surendra Singh Lilhore updated HDFS-8341: - Assignee: (was: Surendra Singh Lilhore) > (Summary & Description may be invalid) HDFS mover stuck in loop after failing > to move block, doesn't move rest of blocks, can't get data back off > decommissioning external storage tier as a result > --- > > Key: HDFS-8341 > URL: https://issues.apache.org/jira/browse/HDFS-8341 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover >Affects Versions: 2.6.0 > Environment: HDP 2.2 >Reporter: Hari Sekhon >Priority: Minor > > HDFS mover gets stuck looping on a block that fails to move and doesn't > migrate the rest of the blocks. > This is preventing recovery of data from a decomissioning external storage > tier used for archive (we've had problems with that proprietary "hyperscale" > storage product which is why a couple blocks here and there have checksum > problems or premature eof as shown below), but this should not prevent moving > all the other blocks to recover our data: > {code}hdfs mover -p /apps/hive/warehouse/ > 15/05/07 14:52:50 INFO mover.Mover: namenodes = > {hdfs://nameservice1=[/apps/hive/warehouse/]} > 15/05/07 14:52:51 INFO balancer.KeyManager: Block token params received from > NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec > 15/05/07 14:52:51 INFO block.BlockTokenSecretManager: Setting block keys > 15/05/07 14:52:51 INFO balancer.KeyManager: Update block keys every 2hrs, > 30mins, 0sec > 15/05/07 14:52:52 INFO block.BlockTokenSecretManager: Setting block keys > 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: > /default-rack/:1019 > 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: > /default-rack/:1019 > 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: > /default-rack/:1019 > 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: > /default-rack/:1019 > 15/05/07 
14:52:52 INFO net.NetworkTopology: Adding a new node: > /default-rack/:1019 > 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: > /default-rack/:1019 > 15/05/07 14:52:52 WARN balancer.Dispatcher: Failed to move > blk_1075156654_1438349 with size=134217728 from :1019:ARCHIVE to > :1019:DISK through :1019: block move is failed: opReplaceBlock > BP-120244285--1417023863606:blk_1075156654_1438349 received exception > java.io.EOFException: Premature EOF: no length prefix available > > 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: > /default-rack/:1019 > 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: > /default-rack/:1019 > 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: > /default-rack/:1019 > 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: > /default-rack/:1019 > 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: > /default-rack/:1019 > 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: > /default-rack/:1019 > 15/05/07 14:53:31 WARN balancer.Dispatcher: Failed to move > blk_1075156654_1438349 with size=134217728 from :1019:ARCHIVE to > :1019:DISK through :1019: block move is failed: opReplaceBlock > BP-120244285--1417023863606:blk_1075156654_1438349 received exception > java.io.EOFException: Premature EOF: no length prefix available > .. > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9092) Nfs silently drops overlapping write requests, thus data copying can't complete
[ https://issues.apache.org/jira/browse/HDFS-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791684#comment-14791684 ] Hadoop QA commented on HDFS-9092: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 21s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 1s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 58s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 15s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 22s | There were no new checkstyle issues. | | {color:red}-1{color} | whitespace | 0m 1s | The patch has 18 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 30s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 0m 53s | The patch appears to introduce 2 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 15s | Pre-build of native portion | | {color:green}+1{color} | hdfs tests | 1m 46s | Tests passed in hadoop-hdfs-nfs. 
| | | | 43m 23s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs-nfs | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12756410/HDFS-9092.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 0832b38 | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12502/artifact/patchprocess/whitespace.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12502/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs-nfs.html | | hadoop-hdfs-nfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12502/artifact/patchprocess/testrun_hadoop-hdfs-nfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12502/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12502/console | This message was automatically generated. > Nfs silently drops overlapping write requests, thus data copying can't > complete > --- > > Key: HDFS-9092 > URL: https://issues.apache.org/jira/browse/HDFS-9092 > Project: Hadoop HDFS > Issue Type: Bug > Components: nfs >Affects Versions: 2.7.1 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Attachments: HDFS-9092.001.patch > > > When NOT using the 'sync' option, NFS writes may issue the following warning: > org.apache.hadoop.hdfs.nfs.nfs3.OpenFileCtx: Got an overlapping write > (1248751616, 1249677312), nextOffset=1248752400. Silently drop it now > and the size of data copied via NFS will stay at 1248752400. > What happened is: > 1. The write requests from the client are sent asynchronously. > 2. The NFS gateway has a handler that handles each incoming request by creating an > internal write request structure and putting it into a cache; > 3. 
In parallel, a separate thread in the NFS gateway takes requests out of the > cache and writes the data to HDFS. > The current offset is how much data has been written by the write thread in > step 3. The detection of an overlapping write request happens in step 2, but it only > checks the write request against the current offset, and trims the request if > necessary. Because the write requests are sent asynchronously, if two > requests are both beyond the current offset and they overlap, the overlap is not detected > and both are put into the cache. This causes the symptom reported in this case > at step 3. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
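The race described in the HDFS-9092 analysis can be modeled in a few lines. The class and method names below are hypothetical, not the real {{OpenFileCtx}} API: the receive path checks a request only against {{nextOffset}} (how far the writer thread has gotten), so two cached requests that overlap *each other* both slip into the cache undetected.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Simplified model of the overlap-detection gap (hypothetical names).
class OverlapSketch {
    long nextOffset;                                   // bytes already written to HDFS
    final ConcurrentSkipListMap<Long, Long> cache =
        new ConcurrentSkipListMap<>();                 // pending requests: start offset -> length

    OverlapSketch(long nextOffset) { this.nextOffset = nextOffset; }

    // Current behavior (sketch): compare against nextOffset only, then cache.
    boolean receive(long off, long len) {
        if (off + len <= nextOffset) {
            return false;                              // entirely behind the write point: dropped
        }
        cache.put(off, len);                           // no check against other cached requests
        return true;
    }

    // The missing check: does [off, off+len) overlap any cached pending request?
    boolean overlapsPending(long off, long len) {
        for (Map.Entry<Long, Long> e : cache.entrySet()) {
            long start = e.getKey(), end = start + e.getValue();
            if (off < end && start < off + len) {
                return true;
            }
        }
        return false;
    }
}
```

Using the offsets from the warning in the description: a request starting at 1248751616 overlaps a pending request at 1248752400, yet {{receive}} accepts both, which mirrors why the copy stalls at 1248752400.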
[jira] [Commented] (HDFS-9089) Balancer and Mover should use ".system" as reserved inode name instead of "system"
[ https://issues.apache.org/jira/browse/HDFS-9089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791692#comment-14791692 ] Hadoop QA commented on HDFS-9089: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 23s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 58s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 22s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 22s | The applied patch generated 1 new checkstyle issues (total was 59, now 60). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 30s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 2m 6s | Post-patch findbugs hadoop-hdfs-project/hadoop-hdfs compilation is broken. | | {color:green}+1{color} | findbugs | 2m 6s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 0m 24s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 0m 21s | Tests failed in hadoop-hdfs. 
| | | | 43m 27s | | \\ \\ || Reason || Tests || | Failed build | hadoop-hdfs | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12756411/HDFS-9089.02.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 0832b38 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12504/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12504/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12504/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12504/console | This message was automatically generated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9040) Erasure coding: Refactor DFSStripedOutputStream (Move Namenode RPC Requests to Coordinator)
[ https://issues.apache.org/jira/browse/HDFS-9040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791739#comment-14791739 ] Walter Su commented on HDFS-9040: - bq. should we just do updatePipeline when completing the block? 1. In the read-being-written scenario, there will be a longer window of *false-fresh* (meaning a stale internal block is considered fresh). We should do it before hflush/hsync as well. bq. 2. When NUM_PARITY_BLOCKS number of streamers are dead, the OutputStream should die immediately instead of waiting for the next writeChunk. A failed streamer is detected in writeChunk. We plan to add periodic checking; [~jingzhao] said that before. bq. 3. We might want to add the logic to replace a failed StripedDataStreamer in the future. No, I don't think we will, if you mean something like datanode replacement for a replicated block: you can transfer a healthy replicated RBW to a new datanode and still have 3 DNs after the replacement, but recovering a corrupted RBW internal block is difficult. I have a question: instead of delaying it, do we even need to refresh UC.replicas? 1. A client reading a UC block being written can decode a replica if it misses some part. (With checksum verification, we are only concerned about 'missing'.) 2. Block recovery / lease recovery truncates all RBW lengths to the minimal length for a replicated block. For striping, assume a corrupted internal block has a small length, like 200kb, while the 8 healthy internal blocks have long lengths, like (1mb-cellSize, 1mb+cellSize). Of course, after recovery we should truncate the 8 to 1mb (the 8 healthy internal blocks should be at the same last stripe, but should we truncate the last stripe? That's not my point.) My point is, we can rule out the corrupted internal blocks via {{commitBlockSynchronization}}. 3. Maintaining the indices of UC.replicas: UC.replicas updated by a BlockReport is safe, because the reportedBlock has an ID. If UC.replicas is updated by updatePipeline, the indices are derived from the array offset -- see {{UC.setExpectedLocations()}}; it's error prone. If we don't refresh UC.replicas, we are pretty safe. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9089) Balancer and Mover should use ".system" as reserved inode name instead of "system"
[ https://issues.apache.org/jira/browse/HDFS-9089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14791760#comment-14791760 ] Tsz Wo Nicholas Sze commented on HDFS-9089: --- > Its better to make ".system" as reserved inode for balancer and mover instead > of "system". Why is ".system" better than "system"? The same argument applies -- what if users want to create ".system"? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8341) (Summary & Description may be invalid) HDFS mover stuck in loop after failing to move block, doesn't move rest of blocks, can't get data back off decommissioning external
[ https://issues.apache.org/jira/browse/HDFS-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze resolved HDFS-8341. --- Resolution: Invalid Resolving as invalid. Please feel free to reopen if you disagree. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8550) Erasure Coding: Fix FindBugs Multithreaded correctness Warning
[ https://issues.apache.org/jira/browse/HDFS-8550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14802814#comment-14802814 ] Rakesh R commented on HDFS-8550: Hi [~zhz], I attached a patch to resolve the findbugs warnings; it would be great if you could pitch in and review it! > Erasure Coding: Fix FindBugs Multithreaded correctness Warning > -- > > Key: HDFS-8550 > URL: https://issues.apache.org/jira/browse/HDFS-8550 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-8550-HDFS-7285-00.patch, > HDFS-8550-HDFS-7285-01.patch > > > Please find the findbugs warnings in the > [report|https://builds.apache.org/job/PreCommit-HDFS-Build/12444/artifact/patchprocess/patchFindbugsWarningshadoop-hdfs.html]: > 1) {code} > Bug type IS2_INCONSISTENT_SYNC (click for details) > In class org.apache.hadoop.hdfs.DFSStripedInputStream > Field org.apache.hadoop.hdfs.DFSStripedInputStream.curStripeBuf > Synchronized 90% of the time > Unsynchronized access at DFSStripedInputStream.java:[line 829] > Synchronized access at DFSStripedInputStream.java:[line 183] > Synchronized access at DFSStripedInputStream.java:[line 186] > Synchronized access at DFSStripedInputStream.java:[line 184] > Synchronized access at DFSStripedInputStream.java:[line 382] > Synchronized access at DFSStripedInputStream.java:[line 460] > Synchronized access at DFSStripedInputStream.java:[line 461] > Synchronized access at DFSStripedInputStream.java:[line 461] > Synchronized access at DFSStripedInputStream.java:[line 285] > Synchronized access at DFSStripedInputStream.java:[line 297] > Synchronized access at DFSStripedInputStream.java:[line 298] > {code} > 2) > {code} > Unread field: > org.apache.hadoop.hdfs.DFSStripedInputStream$BlockReaderInfo.targetBlock > Bug type URF_UNREAD_FIELD (click for details) > In class org.apache.hadoop.hdfs.DFSStripedInputStream$BlockReaderInfo > Field org.apache.hadoop.hdfs.DFSStripedInputStream$BlockReaderInfo.targetBlock 
> At DFSStripedInputStream.java:[line 126] > {code} > 3) > {code} > Unchecked/unconfirmed cast from org.apache.hadoop.hdfs.protocol.LocatedBlock > to org.apache.hadoop.hdfs.protocol.LocatedStripedBlock in > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.setBlockToken(LocatedBlock, > BlockTokenIdentifier$AccessMode) > Bug type BC_UNCONFIRMED_CAST (click for details) > In class org.apache.hadoop.hdfs.server.blockmanagement.BlockManager > In method > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.setBlockToken(LocatedBlock, > BlockTokenIdentifier$AccessMode) > Actual type org.apache.hadoop.hdfs.protocol.LocatedBlock > Expected org.apache.hadoop.hdfs.protocol.LocatedStripedBlock > Value loaded from b > At BlockManager.java:[line 974] > {code} > 4) > {code} > Result of integer multiplication cast to long in > org.apache.hadoop.hdfs.util.StripedBlockUtil.divideOneStripe(ErasureCodingPolicy, > int, LocatedStripedBlock, long, long, ByteBuffer) > Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) > In class org.apache.hadoop.hdfs.util.StripedBlockUtil > In method > org.apache.hadoop.hdfs.util.StripedBlockUtil.divideOneStripe(ErasureCodingPolicy, > int, LocatedStripedBlock, long, long, ByteBuffer) > At StripedBlockUtil.java:[line 375] > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
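Warning 4 above (ICAST_INTEGER_MULTIPLY_CAST_TO_LONG) is worth a standalone illustration, detached from the actual StripedBlockUtil code: multiplying two ints overflows *before* the widening to long, so the cast must be applied to an operand rather than to the product. The method names below are illustrative only.

```java
// Demonstrates the FindBugs ICAST_INTEGER_MULTIPLY_CAST_TO_LONG pattern.
class CastSketch {
    static long productThenCast(int cellSize, int cells) {
        return cellSize * cells;        // bug: 32-bit multiply overflows, then widens
    }
    static long castThenProduct(int cellSize, int cells) {
        return (long) cellSize * cells; // fix: widen an operand first, exact 64-bit result
    }
}
```

For example, with cellSize = 2^20 and cells = 2^12 the true product is 2^32, which the buggy variant silently collapses to 0.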
[jira] [Issue Comment Deleted] (HDFS-9053) Support large directories efficiently using B-Tree
[ https://issues.apache.org/jira/browse/HDFS-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yi Liu updated HDFS-9053:
-------------------------
    Comment: was deleted

(was:
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 17m 54s | Findbugs (version ) appears to be broken on trunk. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 7 new or modified test files. |
| {color:red}-1{color} | javac | 7m 59s | The applied patch generated 28 additional warning messages. |
| {color:green}+1{color} | javadoc | 10m 9s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 30s | The applied patch generated 16 new checkstyle issues (total was 0, now 16). |
| {color:red}-1{color} | whitespace | 0m 8s | The patch has 7 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install | 1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 4m 22s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests | 22m 23s | Tests failed in hadoop-common. |
| {color:red}-1{color} | hdfs tests | 43m 47s | Tests failed in hadoop-hdfs. |
| | | | 111m 4s | |

|| Reason || Tests ||
| Failed unit tests | hadoop.net.TestClusterTopology |
| | hadoop.hdfs.TestFileStatus |
| | hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes |
| | hadoop.hdfs.server.namenode.TestINodeFile |
| | hadoop.fs.contract.hdfs.TestHDFSContractOpen |
| | hadoop.hdfs.server.datanode.TestFsDatasetCache |
| | hadoop.hdfs.server.datanode.fsdataset.impl.TestDatanodeRestart |
| | hadoop.hdfs.server.namenode.ha.TestHASafeMode |
| | hadoop.hdfs.TestEncryptionZonesWithHA |
| | hadoop.fs.contract.hdfs.TestHDFSContractMkdir |
| | hadoop.hdfs.server.namenode.TestNameNodeXAttr |
| | hadoop.hdfs.server.namenode.ha.TestBootstrapStandby |
| | hadoop.hdfs.shortcircuit.TestShortCircuitCache |
| | hadoop.hdfs.TestDecommission |
| | hadoop.hdfs.server.namenode.TestFSEditLogLoader |
| | hadoop.hdfs.server.namenode.ha.TestDelegationTokensWithHA |
| | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
| | hadoop.hdfs.server.blockmanagement.TestNameNodePrunesMissingStorages |
| | hadoop.hdfs.server.datanode.TestCachingStrategy |
| | hadoop.hdfs.server.namenode.TestFileJournalManager |
| | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| | hadoop.cli.TestXAttrCLI |
| | hadoop.hdfs.server.namenode.TestDeleteRace |
| | hadoop.hdfs.server.namenode.TestParallelImageWrite |
| | hadoop.hdfs.server.namenode.TestNameNodeRespectsBindHostKeys |
| | hadoop.hdfs.server.namenode.TestNNStorageRetentionFunctional |
| | hadoop.hdfs.server.namenode.TestSaveNamespace |
| | hadoop.hdfs.TestDFSRename |
| | hadoop.hdfs.util.TestDiff |
| | hadoop.hdfs.server.datanode.TestDataNodeFSDataSetSink |
| | hadoop.hdfs.server.namenode.snapshot.TestSnapshotManager |
| | hadoop.hdfs.server.namenode.TestFsck |
| | hadoop.hdfs.server.namenode.ha.TestHarFileSystemWithHA |
| | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaPlacement |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
| | hadoop.hdfs.server.datanode.TestDeleteBlockPool |
| | hadoop.hdfs.TestRemoteBlockReader2 |
| | hadoop.hdfs.server.namenode.TestStorageRestore |
| | hadoop.hdfs.server.namenode.TestFileLimit |
| | hadoop.hdfs.server.blockmanagement.TestNodeCount |
| | hadoop.fs.contract.hdfs.TestHDFSContractSetTimes |
| | hadoop.hdfs.server.namenode.snapshot.TestCheckpointsWithSnapshots |
| | hadoop.hdfs.server.namenode.TestFSPermissionChecker |
| | hadoop.hdfs.server.namenode.TestSecureNameNode |
| | hadoop.hdfs.server.namenode.TestFileContextAcl |
| | hadoop.hdfs.server.datanode.TestDataXceiverLazyPersistHint |
| | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
| | hadoop.hdfs.TestDataTransferProtocol |
| | hadoop.fs.viewfs.TestViewFsWithXAttrs |
| | hadoop.hdfs.server.blockmanagement.TestDatanodeManager |
| | hadoop.hdfs.server.datanode.TestDiskError |
| | hadoop.hdfs.server.namenode.TestMalformedURLs |
| | hadoop.hdfs.TestReadWhileWriting |
| | hadoop.fs.TestSWebHdfsFileContextMainOperations |
| | hadoop.hdfs.TestIsMethodSupported |
| | hadoop.hdfs.TestParallelShortCircuitReadNoChecksum |
| | hadoop.hdfs.server.blockmanagement.TestAvailableSpaceBlockPlacementPolicy |
| | hadoop.hdfs.TestFileCreationClient |
[jira] [Commented] (HDFS-9053) Support large directories efficiently using B-Tree
[ https://issues.apache.org/jira/browse/HDFS-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802729#comment-14802729 ]

Yi Liu commented on HDFS-9053:
------------------------------
The single failed test is not related to this patch.

> Support large directories efficiently using B-Tree
> --------------------------------------------------
>
>                 Key: HDFS-9053
>                 URL: https://issues.apache.org/jira/browse/HDFS-9053
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Yi Liu
>            Assignee: Yi Liu
>            Priority: Critical
>         Attachments: HDFS-9053 (BTree with simple benchmark).patch, HDFS-9053 (BTree).patch, HDFS-9053.001.patch
>
> This is a long-standing issue that we have tried to improve in the past.
> Currently we use an ArrayList for the children under a directory, kept in sorted order. For insert/delete/search the lookup takes O(log n) time, but insertion and deletion cause re-allocations and copies of big arrays, so the operations are costly. For example, if the children grow to 1M entries, the ArrayList resizes to > 1M capacity, which needs > 1M * 4 bytes = 4 MB of contiguous heap memory; this easily causes full GC in an HDFS cluster where namenode heap memory is already highly used. To recap the 3 main issues:
> # Insertion/deletion operations in large directories are expensive because of re-allocations and copies of big arrays.
> # Dynamically allocating several MB of contiguous, long-lived heap memory can easily cause full GC problems.
> # Even if most children are removed later, the directory INode still occupies the same heap memory, since the ArrayList never shrinks.
> This JIRA is similar to HDFS-7174 created by [~kihwal], but uses a B-Tree to solve the problem, as suggested by [~shv].
> So the target of this JIRA is to implement a low-memory-footprint B-Tree and use it to replace the ArrayList.
> If the number of elements is not large (less than the maximum degree of a B-Tree node), the B-Tree has only a root node, which contains an array for the elements. If the size grows large enough, the node splits automatically, and if elements are removed, B-Tree nodes can merge automatically (see more: https://en.wikipedia.org/wiki/B-tree). This solves the above 3 issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
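To make the proposal above concrete, here is a minimal, hypothetical B-tree sketch (not the HDFS-9053 patch; the real patch stores INodes, not ints, and adds merging on delete). It shows exactly the behavior the description names: a single root node holding a small sorted array until it exceeds the maximum key count, at which point it splits, so no single node ever needs a multi-megabyte contiguous array.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Minimal B-tree sketch: each node holds at most 2*T-1 sorted keys in a
 *  small array; a full node splits instead of growing, so array copies stay
 *  bounded by the node size. Deletion/merging is omitted for brevity. */
class BTreeSketch {
    static final int T = 2;                       // minimum degree (max keys = 2*T-1)

    static class Node {
        List<Integer> keys = new ArrayList<>();
        List<Node> children = new ArrayList<>();
        boolean leaf() { return children.isEmpty(); }
    }

    Node root = new Node();

    boolean contains(int k) { return contains(root, k); }

    private boolean contains(Node n, int k) {
        int i = Collections.binarySearch(n.keys, k);
        if (i >= 0) return true;
        if (n.leaf()) return false;
        return contains(n.children.get(-i - 1), k);   // descend into child slot
    }

    void insert(int k) {
        if (root.keys.size() == 2 * T - 1) {          // root full: grow height by 1
            Node newRoot = new Node();
            newRoot.children.add(root);
            splitChild(newRoot, 0);
            root = newRoot;
        }
        insertNonFull(root, k);
    }

    private void insertNonFull(Node n, int k) {
        int i = Collections.binarySearch(n.keys, k);
        if (i >= 0) return;                           // already present at this node
        int pos = -i - 1;
        if (n.leaf()) {
            n.keys.add(pos, k);                       // copy bounded by node size, not tree size
        } else {
            if (n.children.get(pos).keys.size() == 2 * T - 1) {
                splitChild(n, pos);
                if (k > n.keys.get(pos)) pos++;
            }
            insertNonFull(n.children.get(pos), k);
        }
    }

    private void splitChild(Node parent, int idx) {
        Node full = parent.children.get(idx);
        Node right = new Node();
        int median = full.keys.get(T - 1);
        right.keys.addAll(full.keys.subList(T, full.keys.size()));
        full.keys.subList(T - 1, full.keys.size()).clear();
        if (!full.leaf()) {
            right.children.addAll(full.children.subList(T, full.children.size()));
            full.children.subList(T, full.children.size()).clear();
        }
        parent.keys.add(idx, median);                 // median moves up
        parent.children.add(idx + 1, right);
    }
}
```

With distinct keys inserted in any order, the tree stays balanced and every allocation is a node-sized array, which is the memory-footprint property the JIRA is after.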
[jira] [Commented] (HDFS-8341) (Summary & Description may be invalid) HDFS mover stuck in loop after failing to move block, doesn't move rest of blocks, can't get data back off decommissioning externa
[ https://issues.apache.org/jira/browse/HDFS-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802746#comment-14802746 ]

Hari Sekhon commented on HDFS-8341:
-----------------------------------
[~szetszwo] I believe this ticket is still valid.

There were holes in the data because that storage tier had a replication factor of 1: replication was supposed to be handled within the proprietary hyperscale storage solution underpinning the tier, so there was no point in storing multiple HDFS replicas there. So when a given block's checksum failed, HDFS Mover looped on that block (presumably hoping to find another valid replica to use, but there were no other replicas, so it was stuck looping on the one corrupt replica) and never got past it, meaning the rest of the data's blocks were never transferred. The same problem would occur if all replicas were corrupt, or if a block was under-replicated (which happens often) and its only existing replica was corrupt.

So this jira is still valid: if HDFS Mover can't find a valid, non-corrupt replica, it doesn't proceed to move the rest of the blocks, which prevented decommissioning of this storage tier. This is why I scripted a custom recovery job under the hood of Hadoop; the other blocks were fine, and the Mover was leaving a lot of data behind on the external storage tier.

> (Summary & Description may be invalid) HDFS mover stuck in loop after failing
> to move block, doesn't move rest of blocks, can't get data back off
> decommissioning external storage tier as a result
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-8341
>                 URL: https://issues.apache.org/jira/browse/HDFS-8341
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>    Affects Versions: 2.6.0
>         Environment: HDP 2.2
>            Reporter: Hari Sekhon
>            Priority: Minor
>
> HDFS mover gets stuck looping on a block that fails to move and doesn't migrate the rest of the blocks.
> This is preventing recovery of data from a decommissioning external storage tier used for archive (we've had problems with that proprietary "hyperscale" storage product, which is why a couple of blocks here and there have checksum problems or premature EOF as shown below), but this should not prevent moving all the other blocks to recover our data:
> {code}
> hdfs mover -p /apps/hive/warehouse/
> 15/05/07 14:52:50 INFO mover.Mover: namenodes = {hdfs://nameservice1=[/apps/hive/warehouse/]}
> 15/05/07 14:52:51 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 15/05/07 14:52:51 INFO block.BlockTokenSecretManager: Setting block keys
> 15/05/07 14:52:51 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec
> 15/05/07 14:52:52 INFO block.BlockTokenSecretManager: Setting block keys
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: /default-rack/:1019
> 15/05/07 14:52:52 INFO net.NetworkTopology: Adding a new node: /default-rack/:1019
> 15/05/07 14:52:52 WARN balancer.Dispatcher: Failed to move blk_1075156654_1438349 with size=134217728 from :1019:ARCHIVE to :1019:DISK through :1019: block move is failed: opReplaceBlock BP-120244285--1417023863606:blk_1075156654_1438349 received exception java.io.EOFException: Premature EOF: no length prefix available
>
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: /default-rack/:1019
> 15/05/07 14:53:31 INFO net.NetworkTopology: Adding a new node: /default-rack/:1019
> 15/05/07 14:53:31 WARN balancer.Dispatcher: Failed to move blk_1075156654_1438349 with size=134217728 from :1019:ARCHIVE to :1019:DISK through :1019: block move is failed: opReplaceBlock BP-120244285--1417023863606:blk_1075156654_1438349 received exception java.io.EOFException: Premature EOF: no length prefix available
> ..
> {code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
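The failure mode described above (retrying one unmovable block forever instead of moving on) can be avoided by capping per-block attempts. The following is a hypothetical scheduling sketch, not the actual Balancer/Mover Dispatcher code; `MoveScheduler`, `BlockMover`, and `MAX_ATTEMPTS` are all illustrative names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical move loop: after MAX_ATTEMPTS failed moves a block is
 *  abandoned rather than retried forever, so the remaining blocks still
 *  migrate. This is a sketch of the idea, not HDFS's real Dispatcher. */
class MoveScheduler {
    static final int MAX_ATTEMPTS = 3;

    interface BlockMover { boolean tryMove(String blockId); }  // true on success

    /** Drains the queue; returns the blocks that were permanently given up on. */
    static List<String> drain(Queue<String> pending, BlockMover mover) {
        Map<String, Integer> attempts = new HashMap<>();
        List<String> abandoned = new ArrayList<>();
        while (!pending.isEmpty()) {
            String blk = pending.poll();
            if (mover.tryMove(blk)) continue;            // moved successfully
            int n = attempts.merge(blk, 1, Integer::sum);
            if (n >= MAX_ATTEMPTS) {
                abandoned.add(blk);                      // corrupt/unmovable: skip it
            } else {
                pending.add(blk);                        // retry later, behind others
            }
        }
        return abandoned;
    }
}
```

With this shape, one corrupt replica costs at most MAX_ATTEMPTS move attempts and the rest of the tier can still be evacuated, which is the recovery behavior the comment argues for.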
[jira] [Reopened] (HDFS-8341) (Summary & Description may be invalid) HDFS mover stuck in loop after failing to move block, doesn't move rest of blocks, can't get data back off decommissioning external
[ https://issues.apache.org/jira/browse/HDFS-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sekhon reopened HDFS-8341:
-------------------------------

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8341) HDFS mover stuck in loop trying to move corrupt block with no other valid replicas, doesn't move rest of other data blocks
[ https://issues.apache.org/jira/browse/HDFS-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sekhon updated HDFS-8341:
------------------------------
    Summary: HDFS mover stuck in loop trying to move corrupt block with no other valid replicas, doesn't move rest of other data blocks  (was: HDFS mover stuck in loop trying to move corrupt block with no other valid replicas, doesn't move rest of other data blocks, can't get data back off decommissioning external storage tier as a result)

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8550) Erasure Coding: Fix FindBugs Multithreaded correctness Warning
[ https://issues.apache.org/jira/browse/HDFS-8550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802743#comment-14802743 ]

Hadoop QA commented on HDFS-8550:
---------------------------------
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 15m 48s | Findbugs (version ) appears to be broken on HDFS-7285. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac | 7m 45s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 55s | There were no new javadoc warning messages. |
| {color:red}-1{color} | release audit | 0m 16s | The applied patch generated 1 release audit warnings. |
| {color:green}+1{color} | checkstyle | 0m 31s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 38s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 37s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 2m 33s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native | 3m 11s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 225m 53s | Tests failed in hadoop-hdfs. |
| | | | 268m 12s | |

|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
| | hadoop.hdfs.server.datanode.TestFsDatasetCache |
| | hadoop.hdfs.server.namenode.TestFileTruncate |
| | hadoop.hdfs.TestWriteStripedFileWithFailure |
| | hadoop.hdfs.TestReplaceDatanodeOnFailure |
| | hadoop.hdfs.TestRollingUpgrade |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12756408/HDFS-8550-HDFS-7285-01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | HDFS-7285 / ced438a |
| Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/12503/artifact/patchprocess/patchReleaseAuditProblems.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12503/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12503/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12503/console |

This message was automatically generated.

> Erasure Coding: Fix FindBugs Multithreaded correctness Warning
> --------------------------------------------------------------
>
>                 Key: HDFS-8550
>                 URL: https://issues.apache.org/jira/browse/HDFS-8550
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: HDFS-8550-HDFS-7285-00.patch, HDFS-8550-HDFS-7285-01.patch
>
> Please find the findbug warnings [report|https://builds.apache.org/job/PreCommit-HDFS-Build/12444/artifact/patchprocess/patchFindbugsWarningshadoop-hdfs.html]
> 1)
> {code}
> Bug type IS2_INCONSISTENT_SYNC (click for details)
> In class org.apache.hadoop.hdfs.DFSStripedInputStream
> Field org.apache.hadoop.hdfs.DFSStripedInputStream.curStripeBuf
> Synchronized 90% of the time
> Unsynchronized access at DFSStripedInputStream.java:[line 829]
> Synchronized access at DFSStripedInputStream.java:[line 183]
> Synchronized access at DFSStripedInputStream.java:[line 186]
> Synchronized access at DFSStripedInputStream.java:[line 184]
> Synchronized access at DFSStripedInputStream.java:[line 382]
> Synchronized access at DFSStripedInputStream.java:[line 460]
> Synchronized access at DFSStripedInputStream.java:[line 461]
> Synchronized access at DFSStripedInputStream.java:[line 461]
> Synchronized access at DFSStripedInputStream.java:[line 285]
> Synchronized access at DFSStripedInputStream.java:[line 297]
> Synchronized access at DFSStripedInputStream.java:[line 298]
> {code}
> 2)
> {code}
> Unread field: org.apache.hadoop.hdfs.DFSStripedInputStream$BlockReaderInfo.targetBlock
> Bug type URF_UNREAD_FIELD (click for details)
> In class org.apache.hadoop.hdfs.DFSStripedInputStream$BlockReaderInfo
> Field org.apache.hadoop.hdfs.DFSStripedInputStream$BlockReaderInfo.targetBlock
> At DFSStripedInputStream.java:[line 126]
> {code}
> 3)
> {code}
> Unchecked/unconfirmed cast from org.apache.hadoop.hdfs.protocol.LocatedBlock to
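For readers unfamiliar with the IS2_INCONSISTENT_SYNC warning quoted in the report, here is a generic, self-contained illustration of the pattern it flags (not the actual DFSStripedInputStream code): a field guarded by `synchronized` on most paths, with one unguarded access slipping through. The fix is simply to take the same lock on every path that touches the field.

```java
/** Generic illustration of FindBugs IS2_INCONSISTENT_SYNC: `curBuf` is
 *  accessed under the instance lock almost everywhere, and one rarely-used
 *  path (close) skips the lock, producing the "Synchronized 90% of the
 *  time" report. Not the actual DFSStripedInputStream code. */
class BufferHolder {
    private byte[] curBuf;                         // guarded by "this"

    synchronized void allocate(int size) { curBuf = new byte[size]; }

    synchronized int size() { return curBuf == null ? 0 : curBuf.length; }

    // BUG pattern that triggers the warning: an unsynchronized write.
    //   void close() { curBuf = null; }

    // FIX: guard the rarely-used path with the same lock.
    synchronized void close() { curBuf = null; }
}
```

The same reasoning applies whether the guard is `synchronized` or an explicit lock; what matters is that every read and write of the field uses it.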
[jira] [Updated] (HDFS-8341) HDFS mover stuck in loop trying to move corrupt block with no other valid replicas, doesn't move rest of other data blocks, can't get data back off decommissioning externa
[ https://issues.apache.org/jira/browse/HDFS-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sekhon updated HDFS-8341:
------------------------------
    Summary: HDFS mover stuck in loop trying to move corrupt block with no other valid replicas, doesn't move rest of other data blocks, can't get data back off decommissioning external storage tier as a result  (was: (Summary & Description may be invalid) HDFS mover stuck in loop after failing to move block, doesn't move rest of blocks, can't get data back off decommissioning external storage tier as a result)

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3107) HDFS truncate
[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791779#comment-14791779 ]

Konstantin Shvachko commented on HDFS-3107:
-------------------------------------------
I would say absolutely go for it, as supporting the NFS and FUSE APIs was one of the motivations for truncate. Let's check with the authors what the state of NFS is with respect to truncate. [~brandonli], could you please elaborate?

> HDFS truncate
> -------------
>
>                 Key: HDFS-3107
>                 URL: https://issues.apache.org/jira/browse/HDFS-3107
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>            Reporter: Lei Chang
>            Assignee: Plamen Jeliazkov
>             Fix For: 2.7.0
>
>         Attachments: HDFS-3107-13.patch, HDFS-3107-14.patch, HDFS-3107-15.patch, HDFS-3107-HDFS-7056-combined.patch, HDFS-3107.008.patch, HDFS-3107.15_branch2.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf, HDFS_truncate_semantics_Mar21.pdf, editsStored, editsStored.xml
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the underlying storage when a transaction is aborted. Currently HDFS does not support truncate (a standard POSIX operation), the reverse operation of append, which makes upper-layer applications resort to ugly workarounds (such as keeping track of the discarded byte range per file in a separate metadata store, and periodically running a vacuum process to rewrite compacted files) to overcome this limitation of HDFS.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
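To illustrate the truncate semantics under discussion (cut a file back to a given length, discarding the tail; the reverse of append), here is a local-filesystem sketch using `java.nio`'s `FileChannel.truncate`. Note this is an analogy only: HDFS exposes the operation as `FileSystem#truncate(Path, long)` from the 2.7.0 release this JIRA targets, with its own recovery semantics for the last block.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Local analogue of POSIX/HDFS truncate: shrink a file to newLength.
 *  This is what upper layers (e.g. transaction undo, NFS/FUSE) need in
 *  place of the rewrite-the-whole-file workarounds the description lists. */
class TruncateDemo {
    static long truncateTo(Path file, long newLength) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            ch.truncate(newLength);       // no-op if the file is already shorter
            return ch.size();
        }
    }
}
```

An aborted transaction can then be undone by truncating back to the pre-transaction length instead of rewriting or compacting the file.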
[jira] [Commented] (HDFS-9053) Support large directories efficiently using B-Tree
[ https://issues.apache.org/jira/browse/HDFS-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791829#comment-14791829 ]

Hadoop QA commented on HDFS-9053:
---------------------------------
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 18m 44s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 7 new or modified test files. |
| {color:red}-1{color} | javac | 7m 35s | The applied patch generated 28 additional warning messages. |
| {color:green}+1{color} | javadoc | 9m 50s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 45s | The applied patch generated 16 new checkstyle issues (total was 0, now 16). |
| {color:red}-1{color} | whitespace | 0m 11s | The patch has 7 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install | 1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 31s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 4m 15s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests | 22m 29s | Tests passed in hadoop-common. |
| {color:red}-1{color} | hdfs tests | 160m 56s | Tests failed in hadoop-hdfs. |
| | | | 228m 34s | |

|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.web.TestWebHDFSOAuth2 |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12756270/HDFS-9053.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 0832b38 |
| javac | https://builds.apache.org/job/PreCommit-HDFS-Build/12501/artifact/patchprocess/diffJavacWarnings.txt |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12501/artifact/patchprocess/diffcheckstylehadoop-common.txt |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12501/artifact/patchprocess/whitespace.txt |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12501/artifact/patchprocess/testrun_hadoop-common.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12501/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12501/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12501/console |

This message was automatically generated.

> Support large directories efficiently using B-Tree
> --------------------------------------------------
>
>                 Key: HDFS-9053
>                 URL: https://issues.apache.org/jira/browse/HDFS-9053
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Yi Liu
>            Assignee: Yi Liu
>            Priority: Critical
>         Attachments: HDFS-9053 (BTree with simple benchmark).patch, HDFS-9053 (BTree).patch, HDFS-9053.001.patch
>
> This is a long standing issue, we were trying to improve this in the past.
> Currently we use an ArrayList for the children under a directory, and the children are ordered in the list, for insert/delete/search, the time complexity is O(log n), but insertion/deleting causes re-allocations and copies of big arrays, so the operations are costly. For example, if the children grow to 1M size, the ArrayList will resize to > 1M capacity, so need > 1M * 4bytes = 4M continuous heap memory, it easily causes full GC in HDFS cluster where namenode heap memory is already highly used. I recap the 3 main issues:
> # Insertion/deletion operations in large directories are expensive because re-allocations and copies of big arrays.
> # Dynamically allocate several MB continuous heap memory which will be long-lived can easily cause full GC problem.
> # Even most children are removed later, but the directory INode still occupies same size heap memory, since the ArrayList will never shrink.
> This JIRA is similar to HDFS-7174 created by [~kihwal], but use B-Tree to solve the problem suggested by [~shv].
> So the target of this JIRA is to implement a low memory footprint B-Tree and use it to replace ArrayList.
> If the elements size is not large (less than the maximum degree of B-Tree node), the B-Tree only has one root node which contains an array for the elements. And if the
[jira] [Updated] (HDFS-8632) Erasure Coding: Add InterfaceAudience annotation to the erasure coding classes
[ https://issues.apache.org/jira/browse/HDFS-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated HDFS-8632:
---------------------------
    Status: Patch Available  (was: Open)

> Erasure Coding: Add InterfaceAudience annotation to the erasure coding classes
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-8632
>                 URL: https://issues.apache.org/jira/browse/HDFS-8632
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: HDFS-8632-HDFS-7285-00.patch, HDFS-8632-HDFS-7285-01.patch, HDFS-8632-HDFS-7285-02.patch
>
> I've noticed some of the erasure coding classes missing {{@InterfaceAudience}} annotation. It would be good to identify the classes and add proper annotation.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8873) throttle directoryScanner
[ https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated HDFS-8873: --- Attachment: HDFS-8873.003.patch Reposting patch. > throttle directoryScanner > - > > Key: HDFS-8873 > URL: https://issues.apache.org/jira/browse/HDFS-8873 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.1 >Reporter: Nathan Roberts >Assignee: Daniel Templeton > Attachments: HDFS-8873.001.patch, HDFS-8873.002.patch, > HDFS-8873.003.patch > > > The new 2-level directory layout can make directory scans expensive in terms > of disk seeks (see HDFS-8791 for details). > It would be good if the directoryScanner() had a configurable duty cycle that > would reduce its impact on disk performance (much like the approach in > HDFS-8617). > Without such a throttle, disks can go 100% busy for many minutes at a time > (assuming the common case of all inodes in cache but no directory blocks > cached, 64K seeks are required for a full directory listing, which translates to > 655 seconds) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
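For illustration, a duty-cycle throttle of the kind the description asks for might look like the sketch below; the class, fields, and parameters are hypothetical, not the HDFS-8873 patch:

```java
// Hypothetical duty-cycle throttle (illustrative; not the actual HDFS-8873
// patch): the scanner works for at most a fraction of each period and sleeps
// for the remainder, so disks are not 100% busy for minutes at a time.
public class DutyCycleThrottle {
    final long periodMillis;
    final long onMillis;       // work budget per period
    long windowStart = System.currentTimeMillis();
    long workedInWindow = 0;

    DutyCycleThrottle(long periodMillis, double dutyCycle) {
        this.periodMillis = periodMillis;
        this.onMillis = (long) (periodMillis * dutyCycle);
    }

    // Called between units of work; sleeps once the budget for the current
    // window is exhausted, then starts a fresh window.
    void accountAndMaybePause(long workMillis) throws InterruptedException {
        workedInWindow += workMillis;
        long now = System.currentTimeMillis();
        if (now - windowStart >= periodMillis) {
            windowStart = now;                  // new window, reset the budget
            workedInWindow = 0;
        } else if (workedInWindow >= onMillis) {
            Thread.sleep(periodMillis - (now - windowStart));  // idle out the window
            windowStart = System.currentTimeMillis();
            workedInWindow = 0;
        }
    }

    public static void main(String[] args) {
        DutyCycleThrottle t = new DutyCycleThrottle(1000, 0.25);
        System.out.println(t.onMillis);  // work budget: 250ms per second
    }
}
```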
[jira] [Commented] (HDFS-8873) throttle directoryScanner
[ https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803149#comment-14803149 ] Hadoop QA commented on HDFS-8873: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 16m 13s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 53s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 14s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 26s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 34s | There were no new checkstyle issues. | | {color:red}-1{color} | whitespace | 0m 7s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 41s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 2m 34s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 14s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 130m 1s | Tests failed in hadoop-hdfs. 
| | | | 173m 34s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.hdfs.TestParallelShortCircuitRead | | | hadoop.hdfs.server.namenode.TestAllowFormat | | | hadoop.hdfs.server.namenode.TestCheckPointForSecurityTokens | | | hadoop.hdfs.TestBlockStoragePolicy | | | hadoop.hdfs.server.namenode.snapshot.TestSnapshotMetrics | | | hadoop.hdfs.TestFileLengthOnClusterRestart | | | hadoop.hdfs.TestAppendSnapshotTruncate | | | hadoop.hdfs.server.namenode.snapshot.TestSnapshottableDirListing | | | hadoop.hdfs.server.namenode.snapshot.TestUpdatePipelineWithSnapshots | | | hadoop.hdfs.server.namenode.TestDiskspaceQuotaUpdate | | | hadoop.hdfs.server.namenode.TestCheckpoint | | | hadoop.hdfs.TestDFSUpgradeFromImage | | | hadoop.hdfs.TestReplaceDatanodeOnFailure | | | hadoop.hdfs.TestRemoteBlockReader2 | | | hadoop.hdfs.server.namenode.TestStartup | | | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics | | | hadoop.hdfs.TestDFSStorageStateRecovery | | | hadoop.hdfs.server.namenode.TestFSImageWithXAttr | | | hadoop.hdfs.TestRemoteBlockReader | | | hadoop.hdfs.TestMultiThreadedHflush | | | hadoop.hdfs.TestBlockReaderLocal | | | hadoop.hdfs.server.mover.TestMover | | | hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks | | | hadoop.hdfs.server.namenode.web.resources.TestWebHdfsDataLocality | | | hadoop.hdfs.server.namenode.TestFSImageWithAcl | | | hadoop.hdfs.server.namenode.TestLargeDirectoryDelete | | | hadoop.hdfs.TestPread | | | hadoop.hdfs.server.namenode.TestFSEditLogLoader | | | hadoop.hdfs.server.namenode.ha.TestQuotasWithHA | | | hadoop.hdfs.crypto.TestHdfsCryptoStreams | | | hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerForAcl | | | hadoop.hdfs.TestDFSAddressConfig | | | hadoop.hdfs.server.namenode.TestFSDirectory | | | hadoop.hdfs.server.namenode.snapshot.TestNestedSnapshots | | | hadoop.hdfs.TestParallelRead | | | hadoop.hdfs.TestRestartDFS | | | hadoop.hdfs.TestParallelShortCircuitReadNoChecksum 
| | | hadoop.hdfs.TestParallelShortCircuitLegacyRead | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS | | | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover | | | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | | | hadoop.hdfs.server.namenode.TestDeadDatanode | | | hadoop.hdfs.TestHFlush | | | hadoop.hdfs.server.namenode.ha.TestFailoverWithBlockTokensEnabled | | | hadoop.hdfs.TestFetchImage | | | hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication | | | hadoop.hdfs.server.namenode.ha.TestDNFencing | | | hadoop.hdfs.TestDFSUpgrade | | | hadoop.hdfs.server.namenode.snapshot.TestSnapshotBlocksMap | | | hadoop.hdfs.server.namenode.TestNamenodeRetryCache | | | hadoop.hdfs.TestMissingBlocksAlert | | | hadoop.hdfs.server.namenode.ha.TestHAMetrics | | | hadoop.hdfs.TestQuota | | | hadoop.hdfs.server.namenode.snapshot.TestFileContextSnapshot | | | hadoop.hdfs.server.namenode.TestQuotaByStorageType | | | hadoop.hdfs.server.namenode.snapshot.TestSnapshotDeletion | | | hadoop.hdfs.server.namenode.TestStorageRestore | | | hadoop.hdfs.server.namenode.TestSaveNamespace | | | hadoop.hdfs.server.namenode.TestParallelImageWrite | | | hadoop.hdfs.tools.TestDebugAdmin | | | hadoop.hdfs.TestPersistBlocks |
[jira] [Commented] (HDFS-9095) RPC client should fail gracefully when the connection is timed out or reset
[ https://issues.apache.org/jira/browse/HDFS-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803187#comment-14803187 ] Hadoop QA commented on HDFS-9095: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12756394/HDFS-9095.000.patch | | Optional Tests | javadoc javac unit | | git revision | trunk / 58d1a02 | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12508/console | This message was automatically generated. > RPC client should fail gracefully when the connection is timed out or reset > --- > > Key: HDFS-9095 > URL: https://issues.apache.org/jira/browse/HDFS-9095 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-9095.000.patch > > > The RPC client should fail gracefully when the connection is timed out or > reset, instead of bailing out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8632) Erasure Coding: Add InterfaceAudience annotation to the erasure coding classes
[ https://issues.apache.org/jira/browse/HDFS-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8632: --- Attachment: HDFS-8632-HDFS-7285-03.patch > Erasure Coding: Add InterfaceAudience annotation to the erasure coding classes > -- > > Key: HDFS-8632 > URL: https://issues.apache.org/jira/browse/HDFS-8632 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-8632-HDFS-7285-00.patch, > HDFS-8632-HDFS-7285-01.patch, HDFS-8632-HDFS-7285-02.patch, > HDFS-8632-HDFS-7285-03.patch > > > I've noticed some of the erasure coding classes missing > {{@InterfaceAudience}} annotation. It would be good to identify the classes > and add proper annotation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9022) Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803170#comment-14803170 ] Haohui Mai commented on HDFS-9022: -- The patch looks good to me. bq. This is expected as we will file a new jira to replace the NameNode.getUri() with DFSUtilClient.getNNUri(). See comments above Can you please file the jira and link it to this jira? > Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client > -- > > Key: HDFS-9022 > URL: https://issues.apache.org/jira/browse/HDFS-9022 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client, namenode >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9022.000.patch, HDFS-9022.001.patch, > HDFS-9022.002.patch, HDFS-9022.003.patch, HDFS-9022.004.patch > > > The static helper methods in NameNodes are used in {{hdfs-client}} module. > For example, it's used by the {{DFSClient}} and {{NameNodeProxies}} classes > which are being moved to {{hadoop-hdfs-client}} module. Meanwhile, we should > keep the {{NameNode}} class itself in the {{hadoop-hdfs}} module. > This jira tracks the effort of moving the following static helper methods out > of {{NameNode}} and thus {{hadoop-hdfs}} module. A good place to put these > methods is the {{DFSUtilClient}} class: > {code} > public static InetSocketAddress getAddress(String address); > public static InetSocketAddress getAddress(Configuration conf); > public static InetSocketAddress getAddress(URI filesystemURI); > public static URI getUri(InetSocketAddress namenode); > {code} > Be cautious not to bring new checkstyle warnings. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9093) Initialize protobuf fields in RemoteBlockReaderTest
[ https://issues.apache.org/jira/browse/HDFS-9093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803192#comment-14803192 ] Hadoop QA commented on HDFS-9093: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12756393/HDFS-9093.000.patch | | Optional Tests | javac unit | | git revision | trunk / 58d1a02 | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12510/console | This message was automatically generated. > Initialize protobuf fields in RemoteBlockReaderTest > --- > > Key: HDFS-9093 > URL: https://issues.apache.org/jira/browse/HDFS-9093 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-9093.000.patch > > > Protobuf 2.6.1 complains that the {{ExtendedBlockProto}} objects in > {{remote_block_reader_test.cc}} are not initialized. > The test should be fixed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9095) RPC client should fail gracefully when the connection is timed out or reset
[ https://issues.apache.org/jira/browse/HDFS-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-9095: - Status: Patch Available (was: Open) > RPC client should fail gracefully when the connection is timed out or reset > --- > > Key: HDFS-9095 > URL: https://issues.apache.org/jira/browse/HDFS-9095 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-9095.000.patch > > > The RPC client should fail gracefully when the connection is timed out or > reset, instead of bailing out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9093) Initialize protobuf fields in RemoteBlockReaderTest
[ https://issues.apache.org/jira/browse/HDFS-9093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-9093: - Status: Patch Available (was: Open) > Initialize protobuf fields in RemoteBlockReaderTest > --- > > Key: HDFS-9093 > URL: https://issues.apache.org/jira/browse/HDFS-9093 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-9093.000.patch > > > Protobuf 2.6.1 complains that the {{ExtendedBlockProto}} objects in > {{remote_block_reader_test.cc}} are not initialized. > The test should be fixed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8632) Erasure Coding: Add InterfaceAudience annotation to the erasure coding classes
[ https://issues.apache.org/jira/browse/HDFS-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803173#comment-14803173 ] Rakesh R commented on HDFS-8632: It seems there are few [findbug warnings on LocatedStripedBlock class|https://builds.apache.org/job/PreCommit-HDFS-Build/12505/artifact/patchprocess/patchFindbugsWarningshadoop-hdfs-client.html#Warnings_MALICIOUS_CODE]. Attached another patch fixing the same. > Erasure Coding: Add InterfaceAudience annotation to the erasure coding classes > -- > > Key: HDFS-8632 > URL: https://issues.apache.org/jira/browse/HDFS-8632 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-8632-HDFS-7285-00.patch, > HDFS-8632-HDFS-7285-01.patch, HDFS-8632-HDFS-7285-02.patch, > HDFS-8632-HDFS-7285-03.patch > > > I've noticed some of the erasure coding classes missing > {{@InterfaceAudience}} annotation. It would be good to identify the classes > and add proper annotation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8873) throttle directoryScanner
[ https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802910#comment-14802910 ] Kihwal Lee commented on HDFS-8873: -- Shall we target 2.7.2? > throttle directoryScanner > - > > Key: HDFS-8873 > URL: https://issues.apache.org/jira/browse/HDFS-8873 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.1 >Reporter: Nathan Roberts >Assignee: Daniel Templeton > Attachments: HDFS-8873.001.patch, HDFS-8873.002.patch, > HDFS-8873.003.patch > > > The new 2-level directory layout can make directory scans expensive in terms > of disk seeks (see HDFS-8791 for details). > It would be good if the directoryScanner() had a configurable duty cycle that > would reduce its impact on disk performance (much like the approach in > HDFS-8617). > Without such a throttle, disks can go 100% busy for many minutes at a time > (assuming the common case of all inodes in cache but no directory blocks > cached, 64K seeks are required for a full directory listing, which translates to > 655 seconds) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8632) Erasure Coding: Add InterfaceAudience annotation to the erasure coding classes
[ https://issues.apache.org/jira/browse/HDFS-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14802949#comment-14802949 ] Hadoop QA commented on HDFS-8632: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 18m 2s | Findbugs (version ) appears to be broken on HDFS-7285. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 43s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 15s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 2m 10s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 10s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 42s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 6m 51s | The patch appears to introduce 7 new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | common tests | 23m 40s | Tests failed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 0m 22s | Tests failed in hadoop-hdfs. | | {color:red}-1{color} | hdfs tests | 0m 19s | Tests failed in hadoop-hdfs-client. 
| | | | 72m 8s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | FindBugs | module:hadoop-hdfs-client | | Failed unit tests | hadoop.fs.contract.localfs.TestLocalFSContractMkdir | | Failed build | hadoop-hdfs | | | hadoop-hdfs-client | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12756142/HDFS-8632-HDFS-7285-02.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | HDFS-7285 / ced438a | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/12505/artifact/patchprocess/patchReleaseAuditProblems.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12505/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12505/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs-client.html | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12505/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12505/artifact/patchprocess/testrun_hadoop-hdfs.txt | | hadoop-hdfs-client test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12505/artifact/patchprocess/testrun_hadoop-hdfs-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12505/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12505/console | This message was automatically generated. 
> Erasure Coding: Add InterfaceAudience annotation to the erasure coding classes > -- > > Key: HDFS-8632 > URL: https://issues.apache.org/jira/browse/HDFS-8632 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-8632-HDFS-7285-00.patch, > HDFS-8632-HDFS-7285-01.patch, HDFS-8632-HDFS-7285-02.patch > > > I've noticed some of the erasure coding classes missing > {{@InterfaceAudience}} annotation. It would be good to identify the classes > and add proper annotation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8873) throttle directoryScanner
[ https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated HDFS-8873: --- Attachment: (was: HDFS-8873.003.patch) > throttle directoryScanner > - > > Key: HDFS-8873 > URL: https://issues.apache.org/jira/browse/HDFS-8873 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.1 >Reporter: Nathan Roberts >Assignee: Daniel Templeton > Attachments: HDFS-8873.001.patch, HDFS-8873.002.patch > > > The new 2-level directory layout can make directory scans expensive in terms > of disk seeks (see HDFS-8791 for details). > It would be good if the directoryScanner() had a configurable duty cycle that > would reduce its impact on disk performance (much like the approach in > HDFS-8617). > Without such a throttle, disks can go 100% busy for many minutes at a time > (assuming the common case of all inodes in cache but no directory blocks > cached, 64K seeks are required for a full directory listing, which translates to > 655 seconds) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-9097) Erasure coding: update EC command "-s" flag to "-p" when specifying policy
Zhe Zhang created HDFS-9097: --- Summary: Erasure coding: update EC command "-s" flag to "-p" when specifying policy Key: HDFS-9097 URL: https://issues.apache.org/jira/browse/HDFS-9097 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7285 Reporter: Zhe Zhang Assignee: Zhe Zhang HDFS-8833 missed this update. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8647) Abstract BlockManager's rack policy into BlockPlacementPolicy
[ https://issues.apache.org/jira/browse/HDFS-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803246#comment-14803246 ] Brahma Reddy Battula commented on HDFS-8647: {{TestBlocksWithNotEnoughRacks}} is failing due to the following cases. *Before the change:* if the cluster was multi-rack first and later became single-rack, those blocks were added to {{NeededReplications}}, and the tests expected that value. a. But this held only while the namenode stayed alive. If the NN was restarted after the cluster became single-rack, {{NeededReplications}} would no longer contain the block. b. And if another rack was added before the NN restart, auto replication to the new rack happened. But once the NN was restarted and a new rack added, auto replication to the new rack (when the single rack already has enough replicas == RF) happens only if the RF of those blocks changes. *After the change:* blocks are not added to {{NeededReplications}} immediately after the cluster becomes single-rack. a. After the change, auto replication (when the single rack already has enough replicas == RF) does not happen when one more rack is added to the cluster; it is triggered only if the RF of those blocks changes. If the new behavior is okay, the test case can be updated; otherwise the patch can be updated. > Abstract BlockManager's rack policy into BlockPlacementPolicy > - > > Key: HDFS-8647 > URL: https://issues.apache.org/jira/browse/HDFS-8647 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ming Ma >Assignee: Brahma Reddy Battula > Attachments: HDFS-8647-001.patch, HDFS-8647-002.patch, > HDFS-8647-003.patch > > > Sometimes we want to have the namenode use an alternative block placement policy > such as upgrade domains in HDFS-7541. > BlockManager has built-in assumptions about rack policy in functions such as > useDelHint and blockHasEnoughRacks. That means when we have a new block placement > policy, we need to modify BlockManager to account for the new policy. Ideally
> BlockManager should ask the BlockPlacementPolicy object instead. That will allow > us to provide a new BlockPlacementPolicy without changing BlockManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
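As an illustration of the refactoring the issue proposes, here is a minimal sketch of a rack check living behind a placement-policy interface. All names are hypothetical, not the actual Hadoop patch:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: the rack rule lives behind a placement-policy
// interface instead of being hard-coded in BlockManager, so alternative
// policies (e.g. upgrade domains) can supply their own rule.
interface PlacementCheck {
    boolean isPlacementSatisfied(List<String> replicaRacks);
}

class RackPolicy implements PlacementCheck {
    // Default rack rule: with more than one replica, span at least 2 racks.
    public boolean isPlacementSatisfied(List<String> replicaRacks) {
        if (replicaRacks.size() <= 1) {
            return true;
        }
        Set<String> racks = new HashSet<>(replicaRacks);
        return racks.size() >= 2;
    }
}

public class PolicyDemo {
    public static void main(String[] args) {
        PlacementCheck policy = new RackPolicy();  // swappable implementation
        System.out.println(policy.isPlacementSatisfied(List.of("r1", "r1", "r2")));  // true
        System.out.println(policy.isPlacementSatisfied(List.of("r1", "r1", "r1")));  // false
    }
}
```

With this shape, BlockManager would only call the interface, and a single-rack cluster's behavior becomes a property of the installed policy rather than of BlockManager itself.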
[jira] [Updated] (HDFS-9097) Erasure coding: update EC command "-s" flag to "-p" when specifying policy
[ https://issues.apache.org/jira/browse/HDFS-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-9097: Status: Patch Available (was: Open) > Erasure coding: update EC command "-s" flag to "-p" when specifying policy > -- > > Key: HDFS-9097 > URL: https://issues.apache.org/jira/browse/HDFS-9097 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7285 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > > HDFS-8833 missed this update. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9097) Erasure coding: update EC command "-s" flag to "-p" when specifying policy
[ https://issues.apache.org/jira/browse/HDFS-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-9097: Attachment: HDFS-9097-HDFS-7285.00.patch > Erasure coding: update EC command "-s" flag to "-p" when specifying policy > -- > > Key: HDFS-9097 > URL: https://issues.apache.org/jira/browse/HDFS-9097 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7285 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Attachments: HDFS-9097-HDFS-7285.00.patch > > > HDFS-8833 missed this update. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9063) Correctly handle snapshot path for getContentSummary
[ https://issues.apache.org/jira/browse/HDFS-9063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803241#comment-14803241 ] Jing Zhao commented on HDFS-9063: - Thanks Yi! {{getContentSummary(dir's current path)}} should include (all the current files/directories) + (all the deleted files/directories but still in snapshots). Thus in the above case, the return value 16 in step 6 is correct: we have 15 files in the current dir, and the original first file in dir/.snapshot/s1. > Correctly handle snapshot path for getContentSummary > > > Key: HDFS-9063 > URL: https://issues.apache.org/jira/browse/HDFS-9063 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Jing Zhao >Assignee: Jing Zhao > Attachments: HDFS-9063.000.patch > > > The current getContentSummary implementation does not take into account the > snapshot path, thus if we have the following ops: > 1. create dirs /foo/bar > 2. take snapshot s1 on /foo > 3. create a 1 byte file /foo/bar/baz > then "du /foo" and "du /foo/.snapshot/s1" can report same results for "bar", > which is incorrect since the 1 byte file is not included in snapshot s1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
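The counting rule Jing describes can be modeled in a few lines. This toy model is illustrative plain Java, not HDFS code:

```java
import java.util.HashSet;
import java.util.Set;

public class ContentSummaryModel {
    // Toy model of the rule stated above: the file count for a directory's
    // current path = current files + files that were deleted from the current
    // tree but are still retained in a snapshot.
    static int fileCount(Set<String> current, Set<String> snapshot) {
        Set<String> deletedButSnapshotted = new HashSet<>(snapshot);
        deletedButSnapshotted.removeAll(current);   // kept only under .snapshot
        return current.size() + deletedButSnapshotted.size();
    }

    public static void main(String[] args) {
        // Reproduce the steps from the discussion:
        Set<String> dir = new HashSet<>();
        for (int i = 0; i < 16; i++) dir.add("f" + i);  // create 16 files
        dir.remove("f15");                              // delete the last -> 15
        Set<String> s1 = new HashSet<>(dir);            // take snapshot s1
        dir.add("f16");                                 // add one file -> 16
        dir.remove("f0");                               // delete the first -> 15
        System.out.println(fileCount(dir, s1));         // 16 = 15 current + f0 in s1
    }
}
```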
[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path
[ https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-5802: Attachment: (was: HDFS-5802.002.patch) > NameNode does not check for inode type before traversing down a path > > > Key: HDFS-5802 > URL: https://issues.apache.org/jira/browse/HDFS-5802 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.0-alpha >Reporter: Harsh J >Assignee: Xiao Chen >Priority: Trivial > Attachments: HDFS-5802.001.patch > > > This came up during the discussion on a forum at > http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162 > surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is > a file and not a directory. > In such a case, NameNode yields a user-confusing message of {{Permission > denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead > of clearly saying (and realising) "/foo is not a directory" or "/foo is a > file" before it tries to traverse further down to locate the requested path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-9098) Erasure coding: emulate race conditions among striped streamers in write pipeline
Zhe Zhang created HDFS-9098: --- Summary: Erasure coding: emulate race conditions among striped streamers in write pipeline Key: HDFS-9098 URL: https://issues.apache.org/jira/browse/HDFS-9098 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Apparently the interleaving of events among {{StripedDataStreamer}}s is very tricky to handle. [~walter.k.su] and [~jingzhao] have discussed several race conditions under HDFS-9040. Let's use FaultInjector to emulate different combinations of interleaved events. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
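The FaultInjector approach mentioned above typically follows a pattern like this sketch (hypothetical names, not the actual HDFS class): production code calls a no-op hook that tests can override to force a failure or a specific interleaving of events.

```java
// Generic fault-injector test pattern (illustrative only).
public class FaultInjectorDemo {
    static class FaultInjector {
        static FaultInjector instance = new FaultInjector();
        void beforeBlockAllocation() { /* no-op in production */ }
    }

    static String allocateBlock() {
        FaultInjector.instance.beforeBlockAllocation();  // injection point
        return "blk_001";
    }

    public static void main(String[] args) {
        // A test swaps in an injector that throws, to emulate a failure
        // at exactly this point in the write pipeline.
        FaultInjector.instance = new FaultInjector() {
            @Override
            void beforeBlockAllocation() {
                throw new IllegalStateException("injected failure");
            }
        };
        try {
            allocateBlock();
            System.out.println("no fault");
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Placing such hooks between streamer steps would let a test drive the specific event orderings discussed in HDFS-9040.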
[jira] [Commented] (HDFS-9040) Erasure coding: Refactor DFSStripedOutputStream (Move Namenode RPC Requests to Coordinator)
[ https://issues.apache.org/jira/browse/HDFS-9040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803316#comment-14803316 ] Zhe Zhang commented on HDFS-9040: - bq. 3. We might want to add the logic to replace a failed StripedDataStreamer in the future. bq. No, we won't. I think so, if you're talking about something like Datanode replacement for a repl block: you can transfer a healthy repl RBW to a new Datanode, and then you still get 3 DNs after replacement. But recovering a corrupted RBW internal block is difficult. I agree it's difficult and in this phase I don't think it's necessary. We cannot rule out the possibility though. In the current non-EC pipeline we support multiple failover options. A fast writer can opt out of DN replacement and instead rely on background re-replication. A slow writer might want to replace the DN to prevent data loss during the long window. For a slow EC writer we should consider fixing the pipeline as well, especially at the early stage of writing a block (not too much data to decode). bq. 1. A client read UC block being written can decode replica if it misses some part. (With checksum verification, we are only concerned about 'missing'.) Interesting thought. But {{verifyChecksum}} is optional so we can't always rely on it. If {{verifyChecksum}} becomes mandatory, much of our corrupt replica handling logic could be much simpler. > Erasure coding: Refactor DFSStripedOutputStream (Move Namenode RPC Requests > to Coordinator) > --- > > Key: HDFS-9040 > URL: https://issues.apache.org/jira/browse/HDFS-9040 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Walter Su > Attachments: HDFS-9040-HDFS-7285.002.patch, > HDFS-9040-HDFS-7285.003.patch, HDFS-9040.00.patch, HDFS-9040.001.wip.patch, > HDFS-9040.02.bgstreamer.patch > > > The general idea is to simplify error handling logic. > Proposal 1: > A BlockGroupDataStreamer to communicate with NN to allocate/update blocks, and > StripedDataStreamers only have to stream blocks to DNs. 
> Proposal 2: > See below the > [comment|https://issues.apache.org/jira/browse/HDFS-9040?focusedCommentId=14741388=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14741388] > from [~jingzhao]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9089) Balancer and Mover should use ".system" as reserved inode name instead of "system"
[ https://issues.apache.org/jira/browse/HDFS-9089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803367#comment-14803367 ] Surendra Singh Lilhore commented on HDFS-9089: -- bq. Why ".system" is better than "system"? The same argument applies – what if users want to create ".system"? We thought the user would not create a directory like "{{.system}}" :) > Balancer and Mover should use ".system" as reserved inode name instead of > "system" > -- > > Key: HDFS-9089 > URL: https://issues.apache.org/jira/browse/HDFS-9089 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Reporter: Archana T >Assignee: Surendra Singh Lilhore > Attachments: HDFS-9089.01.patch, HDFS-9089.02.patch > > > Currently Balancer and Mover create "/system" for placing mover.id and > balancer.id > hdfs dfs -ls / > drwxr-xr-x - root hadoop 0 2015-09-16 12:49 > {color:red}/system{color} > This folder is not deleted once the mover or balancer work is completed, > so the user cannot create the dir "system". > It's better to make ".system" a reserved inode for balancer and mover instead > of "system". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8632) Erasure Coding: Add InterfaceAudience annotation to the erasure coding classes
[ https://issues.apache.org/jira/browse/HDFS-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803375#comment-14803375 ] Hadoop QA commented on HDFS-8632: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 18m 1s | Findbugs (version ) appears to be broken on HDFS-7285. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 48s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 20s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 2m 6s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 10s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 38s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 6m 32s | The patch appears to introduce 4 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | common tests | 23m 49s | Tests passed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 41m 4s | Tests failed in hadoop-hdfs. | | {color:red}-1{color} | hdfs tests | 0m 20s | Tests failed in hadoop-hdfs-client. 
| | | | 112m 42s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerForContentSummary | | | hadoop.hdfs.TestSafeMode | | | hadoop.hdfs.TestHFlush | | | hadoop.hdfs.TestModTime | | | hadoop.hdfs.server.blockmanagement.TestBlockManager | | | hadoop.hdfs.server.datanode.TestReadOnlySharedStorage | | | hadoop.hdfs.server.datanode.TestIncrementalBlockReports | | | hadoop.hdfs.TestReservedRawPaths | | | hadoop.hdfs.server.namenode.TestClusterId | | | hadoop.hdfs.server.namenode.ha.TestFailoverWithBlockTokensEnabled | | | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlockQueues | | | hadoop.hdfs.web.TestHttpsFileSystem | | | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes | | | hadoop.hdfs.TestRemoteBlockReader2 | | | hadoop.hdfs.protocol.TestBlockListAsLongs | | | hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation | | | hadoop.hdfs.server.namenode.snapshot.TestSnapshotRename | | | hadoop.hdfs.TestReplication | | | hadoop.hdfs.TestBlocksScheduledCounter | | | hadoop.hdfs.qjournal.client.TestQJMWithFaults | | | hadoop.hdfs.server.namenode.TestNameNodeAcl | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | | | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot | | | hadoop.hdfs.TestDFSInotifyEventInputStream | | | hadoop.hdfs.TestAbandonBlock | | | hadoop.hdfs.TestSetTimes | | | hadoop.hdfs.server.namenode.ha.TestEditLogsDuringFailover | | | hadoop.hdfs.server.namenode.TestNameEditsConfigs | | | hadoop.hdfs.TestDFSFinalize | | | hadoop.hdfs.web.TestFSMainOperationsWebHdfs | | | hadoop.hdfs.TestReadStripedFileWithMissingBlocks | | | hadoop.hdfs.server.datanode.TestBlockRecovery | | | hadoop.hdfs.server.namenode.ha.TestBootstrapStandby | | | hadoop.hdfs.server.namenode.ha.TestStandbyBlockManagement | | | hadoop.hdfs.server.namenode.TestCommitBlockSynchronization | | | hadoop.hdfs.server.namenode.TestNameNodeRecovery | | | 
hadoop.hdfs.server.namenode.TestAuditLogs | | | hadoop.hdfs.server.namenode.TestStorageRestore | | | hadoop.hdfs.server.namenode.snapshot.TestSnapshotDeletion | | | hadoop.hdfs.server.blockmanagement.TestNodeCount | | | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots | | | hadoop.hdfs.server.namenode.ha.TestInitializeSharedEdits | | | hadoop.hdfs.TestAppendDifferentChecksum | | | hadoop.hdfs.server.namenode.TestEditLogRace | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestDatanodeRestart | | | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover | | | hadoop.hdfs.server.namenode.TestNameNodeResourceChecker | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration | | | hadoop.hdfs.TestHDFSFileSystemContract | | | hadoop.hdfs.tools.TestDFSZKFailoverController | | | hadoop.hdfs.server.datanode.TestNNHandlesCombinedBlockReport | | | hadoop.hdfs.TestDFSShell | | | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints | | | hadoop.hdfs.TestMissingBlocksAlert | | |
[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path
[ https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-5802: Status: Patch Available (was: Open) > NameNode does not check for inode type before traversing down a path > > > Key: HDFS-5802 > URL: https://issues.apache.org/jira/browse/HDFS-5802 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.0-alpha >Reporter: Harsh J >Assignee: Xiao Chen >Priority: Trivial > Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch, > HDFS-5802.003.patch > > > This came up during the discussion on a forum at > http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162 > surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is > a file and not a directory. > In such a case, NameNode yields a user-confusing message of {{Permission > denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead > of clearly saying (and realising) "/foo is not a directory" or "/foo is a > file" before it tries to traverse further down to locate the requested path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
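The fix under review makes the NameNode report "/foo is a file, not a directory" instead of a generic EXECUTE permission error. A toy Python model of the intended behavior (the data structures are illustrative, not the NameNode's):

```python
# Toy model of path resolution: directories are dicts, files are strings.
# When an intermediate path component turns out to be a file, raise a clear
# "not a directory" error naming the offending ancestor, instead of the
# confusing permission-denied message described in this issue.
def resolve(fs_tree, path):
    node = fs_tree
    parts = [p for p in path.split("/") if p]
    for i, part in enumerate(parts):
        if not isinstance(node, dict):  # hit a file where a dir is needed
            ancestor = "/" + "/".join(parts[:i])
            raise NotADirectoryError(
                "Path component %s is a file, not a directory" % ancestor)
        if part not in node:
            raise FileNotFoundError(path)
        node = node[part]
    return node
```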
[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path
[ https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-5802: Attachment: HDFS-5802.003.patch Thanks Yongjun for the additional comments! I have fixed it and uploaded patch 003. > NameNode does not check for inode type before traversing down a path > > > Key: HDFS-5802 > URL: https://issues.apache.org/jira/browse/HDFS-5802 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.0-alpha >Reporter: Harsh J >Assignee: Xiao Chen >Priority: Trivial > Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch, > HDFS-5802.003.patch > > > This came up during the discussion on a forum at > http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162 > surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is > a file and not a directory. > In such a case, NameNode yields a user-confusing message of {{Permission > denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead > of clearly saying (and realising) "/foo is not a directory" or "/foo is a > file" before it tries to traverse further down to locate the requested path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7492) If multiple threads call FsVolumeList#checkDirs at the same time, we should only do checkDirs once and give the results to all waiting threads
[ https://issues.apache.org/jira/browse/HDFS-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803419#comment-14803419 ] Colin Patrick McCabe commented on HDFS-7492: [~eclark], also check HDFS-8845 for another improvement in this area. > If multiple threads call FsVolumeList#checkDirs at the same time, we should > only do checkDirs once and give the results to all waiting threads > -- > > Key: HDFS-7492 > URL: https://issues.apache.org/jira/browse/HDFS-7492 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Colin Patrick McCabe >Assignee: Elliott Clark >Priority: Minor > > checkDirs is called when we encounter certain I/O errors. It's rare to get > just a single I/O error... normally you start getting many errors when a disk > is going bad. For this reason, we shouldn't start a new checkDirs scan for > each error. Instead, if multiple threads call FsVolumeList#checkDirs at > around the same time, we should only do checkDirs once and give the results > to all the waiting threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
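The single-scan idea in this issue — one thread performs the expensive check while concurrent callers wait and share the result — can be sketched as a generic coalescing pattern (illustrative only, not the actual FsVolumeList code):

```python
# Sketch of coalescing concurrent checkDirs-style calls: the first caller
# ("leader") runs the scan; callers arriving while it runs wait on an Event
# and reuse the leader's result instead of starting another scan.
import threading

class CoalescingChecker:
    def __init__(self, do_scan):
        self._do_scan = do_scan      # the expensive scan, e.g. a disk check
        self._lock = threading.Lock()
        self._done = None            # Event for the in-flight scan, if any
        self._result = None
        self.scan_count = 0          # how many real scans actually ran

    def check(self):
        with self._lock:
            if self._done is not None:
                done, leader = self._done, False   # join the in-flight scan
            else:
                done = self._done = threading.Event()
                leader = True
        if leader:
            try:
                self._result = self._do_scan()
                self.scan_count += 1
            finally:
                with self._lock:
                    self._done = None
                done.set()           # release all waiters
        else:
            done.wait()
        return self._result
```

The design point is the one from the issue description: disk errors arrive in bursts, so many near-simultaneous callers should trigger one scan, not one scan each.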
[jira] [Commented] (HDFS-9063) Correctly handle snapshot path for getContentSummary
[ https://issues.apache.org/jira/browse/HDFS-9063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803250#comment-14803250 ] Jing Zhao commented on HDFS-9063: - The test failures look suspicious. I just triggered the Jenkins again. > Correctly handle snapshot path for getContentSummary > > > Key: HDFS-9063 > URL: https://issues.apache.org/jira/browse/HDFS-9063 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Jing Zhao >Assignee: Jing Zhao > Attachments: HDFS-9063.000.patch > > > The current getContentSummary implementation does not take into account the > snapshot path, thus if we have the following ops: > 1. create dirs /foo/bar > 2. take snapshot s1 on /foo > 3. create a 1 byte file /foo/bar/baz > then "du /foo" and "du /foo/.snapshot/s1" can report same results for "bar", > which is incorrect since the 1 byte file is not included in snapshot s1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
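The expected semantics in this issue — "du /foo/.snapshot/s1" must reflect the directory as of snapshot s1, not the live state — can be modeled with a toy event log (illustrative only; the NameNode's snapshot diff machinery works differently):

```python
# Toy model: a file is counted in a snapshot only if it was created before
# the snapshot and not deleted before it; the live view counts everything
# up to "now". This mirrors the issue description's point that a file
# created after s1 must not be counted under s1.
class Dir:
    def __init__(self):
        self.events = []   # (time, "create" | "delete", name, size)
        self.clock = 0

    def _tick(self):
        self.clock += 1
        return self.clock

    def create(self, name, size=1):
        self.events.append((self._tick(), "create", name, size))

    def delete(self, name):
        self.events.append((self._tick(), "delete", name, 0))

    def snapshot(self):
        return self._tick()          # a snapshot is just a timestamp here

    def content_summary(self, at=None):
        """Return (file_count, total_size) as of time `at` (None = now)."""
        at = self.clock if at is None else at
        live = {}
        for t, op, name, size in self.events:
            if t > at:
                break                # events are appended in time order
            if op == "create":
                live[name] = size
            else:
                live.pop(name, None)
        return len(live), sum(live.values())
```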
[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
[ https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803321#comment-14803321 ] Zhe Zhang commented on HDFS-8808: - Triggering Jenkins again. > dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby > > > Key: HDFS-8808 > URL: https://issues.apache.org/jira/browse/HDFS-8808 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Gautam Gopalakrishnan >Assignee: Zhe Zhang > Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch, > HDFS-8808-02.patch, HDFS-8808-03.patch, HDFS-8808.04.patch > > > The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the > speed with which the fsimage is copied between the namenodes during regular > use. However, as a side effect, this also limits transfers when the > {{-bootstrapStandby}} option is used. This option is often used during > upgrades and could potentially slow down the entire workflow. The request > here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth > setting -- This message was sent by Atlassian JIRA (v6.3.4#6332)
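The requested behavior — apply the bandwidth limit to regular checkpoint image transfers but not to {{-bootstrapStandby}} — fits the common convention that a non-positive limit disables throttling. A hedged sketch of that convention (illustrative, not the actual HDFS DataTransferThrottler):

```python
# Sketch of a bandwidth limiter where bytes_per_sec <= 0 means
# "unthrottled", so a bootstrapStandby-style caller can pass 0 to bypass
# the limit while regular image transfers keep theirs. Illustrative only.
class Throttler:
    def __init__(self, bytes_per_sec):
        self.bytes_per_sec = bytes_per_sec

    def delay_for(self, nbytes):
        """Seconds a transfer of nbytes should pause under this limit."""
        if self.bytes_per_sec <= 0:
            return 0.0               # throttling disabled
        return nbytes / self.bytes_per_sec

checkpoint = Throttler(1024 * 1024)  # e.g. the configured bandwidthPerSec
bootstrap = Throttler(0)             # bootstrapStandby: no limit
```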
[jira] [Commented] (HDFS-9089) Balancer and Mover should use ".system" as reserved inode name instead of "system"
[ https://issues.apache.org/jira/browse/HDFS-9089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803332#comment-14803332 ] Surendra Singh Lilhore commented on HDFS-9089: -- Thanks [~szetszwo] for the comment. The main purpose of this jira is to delete the {{system}} directory that Mover and Balancer create, once they complete their task; we can't delete the {{system}} directory directly because it may have been created by a user. We thought we could use a reserved directory name like {{.system}}, so that we can safely delete it once the mover and balancer tasks complete. bq. This looks like an incompatible change. I don't see how it is incompatible. > Balancer and Mover should use ".system" as reserved inode name instead of > "system" > -- > > Key: HDFS-9089 > URL: https://issues.apache.org/jira/browse/HDFS-9089 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover >Reporter: Archana T >Assignee: Surendra Singh Lilhore > Attachments: HDFS-9089.01.patch, HDFS-9089.02.patch > > > Currently Balancer and Mover create "/system" for placing mover.id and > balancer.id > hdfs dfs -ls / > drwxr-xr-x - root hadoop 0 2015-09-16 12:49 > {color:red}/system{color} > This folder is not deleted once the mover or balancer work is completed, > so a user cannot create a dir named "system". > It's better to make ".system" a reserved inode for balancer and mover instead > of "system". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections
[ https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803351#comment-14803351 ] Owen O'Malley commented on HDFS-8855: - I'm looking at the patch, but you'll need to resolve the checkstyle, findbugs, and test case failures. > Webhdfs client leaks active NameNode connections > > > Key: HDFS-8855 > URL: https://issues.apache.org/jira/browse/HDFS-8855 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: Bob Hansen >Assignee: Xiaobing Zhou > Attachments: HDFS-8855.1.patch, HDFS-8855.2.patch, HDFS-8855.3.patch, > HDFS-8855.4.patch, HDFS_8855.prototype.patch > > > The attached script simulates a process opening ~50 files via webhdfs and > performing random reads. Note that there are at most 50 concurrent reads, > and all webhdfs sessions are kept open. Each read is ~64k at a random > position. > The script periodically (once per second) shells into the NameNode and > produces a summary of the socket states. For my test cluster with 5 nodes, > it took ~30 seconds for the NameNode to have ~25000 active connections and > fails. > It appears that each request to the webhdfs client is opening a new > connection to the NameNode and keeping it open after the request is complete. > If the process continues to run, eventually (~30-60 seconds), all of the > open connections are closed and the NameNode recovers. > This smells like SoftReference reaping. Are we using SoftReferences in the > webhdfs client to cache NameNode connections but never re-using them? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
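The leak described above — one fresh NameNode connection per request, never reused — is the opposite of a simple per-host pool. A toy sketch of the reuse behavior (hypothetical classes, not the webhdfs client's actual internals):

```python
# Toy per-host connection pool: many requests to the same host share one
# connection instead of opening a new one each time (the leak pattern
# described in this issue). Illustrative only.
class Connection:
    opened = 0                       # class-wide count of real connections

    def __init__(self, host):
        self.host = host
        Connection.opened += 1

class PooledClient:
    def __init__(self):
        self._pool = {}              # host -> cached Connection

    def request(self, host, path):
        conn = self._pool.get(host)
        if conn is None:             # open lazily, then keep for reuse
            conn = self._pool[host] = Connection(host)
        return (conn.host, path)     # stand-in for issuing the request
```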
[jira] [Commented] (HDFS-8968) New benchmark throughput tool for striping erasure coding
[ https://issues.apache.org/jira/browse/HDFS-8968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803386#comment-14803386 ] Andrew Wang commented on HDFS-8968: --- In that case maybe we put it in hadoop-tools? The only concern there is that without unit tests the code won't be exercised regularly, and it might get out of date. I think we could make it run against both a real cluster and a MiniDFSCluster also, since ultimately we're just using the FileSystem API. > New benchmark throughput tool for striping erasure coding > - > > Key: HDFS-8968 > URL: https://issues.apache.org/jira/browse/HDFS-8968 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Kai Zheng >Assignee: Rui Li > Attachments: HDFS-8968-HDFS-7285.1.patch, HDFS-8968-HDFS-7285.2.patch > > > We need a new benchmark tool to measure the throughput of client writing and > reading, considering the following cases and factors: > * 3-replica or striping; > * write or read, stateful read or positional read; > * which erasure coder; > * striping cell size; > * concurrent readers/writers using processes or threads. > The tool should be easy to use and should avoid unnecessary local-environment > impact, such as local disk I/O. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9097) Erasure coding: update EC command "-s" flag to "-p" when specifying policy
[ https://issues.apache.org/jira/browse/HDFS-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803396#comment-14803396 ] Andrew Wang commented on HDFS-9097: --- +1 LGTM, thanks for the update Zhe > Erasure coding: update EC command "-s" flag to "-p" when specifying policy > -- > > Key: HDFS-9097 > URL: https://issues.apache.org/jira/browse/HDFS-9097 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7285 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Attachments: HDFS-9097-HDFS-7285.00.patch > > > HDFS-8833 missed this update. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5802) NameNode does not check for inode type before traversing down a path
[ https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803388#comment-14803388 ] Yongjun Zhang commented on HDFS-5802: - Hi [~xiaochen], Thanks for the new rev. Rev 2 looks good to me, two small nits: {code} 195 /** 196* Check whether an exception is due to inode type not directory 197*/ 198 private void checkAncestorType(INode[] inodes, int ancestorIndex, 199 AccessControlException e) throws AccessControlException { {code} 1. I suggest changing the comment above to "Check whether exception e is due to an ancestor inode's not being directory" 2. the indentation of line 199 should be 4 spaces +1 after that, pending Jenkins tests. Thanks. > NameNode does not check for inode type before traversing down a path > > > Key: HDFS-5802 > URL: https://issues.apache.org/jira/browse/HDFS-5802 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.0-alpha >Reporter: Harsh J >Assignee: Xiao Chen >Priority: Trivial > Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch > > > This came up during the discussion on a forum at > http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162 > surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is > a file and not a directory. > In such a case, NameNode yields a user-confusing message of {{Permission > denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead > of clearly saying (and realising) "/foo is not a directory" or "/foo is a > file" before it tries to traverse further down to locate the requested path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9098) Erasure coding: emulate race conditions among striped streamers in write pipeline
[ https://issues.apache.org/jira/browse/HDFS-9098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-9098: Description: Apparently the interleaving of events among {{StripedDataStreamer}}s is very tricky to handle. [~walter.k.su] and [~jingzhao] have discussed several race conditions under HDFS-9040. Let's use FaultInjector to emulate different combinations of interleaved events. In particular, we should consider injecting delays in the following places: # {{Streamer#endBlock}} # {{Streamer#locateFollowingBlock}} # {{Streamer#updateBlockForPipeline}} # {{Streamer#updatePipeline}} # {{OutputStream#writeChunk}} # {{OutputStream#close}} was: Apparently the interleaving of events among {{StripedDataStreamer}}s is very tricky to handle. [~walter.k.su] and [~jingzhao] have discussed several race conditions under HDFS-9040. Let's use FaultInjector to emulate different combinations of interleaved events. > Erasure coding: emulate race conditions among striped streamers in write > pipeline > - > > Key: HDFS-9098 > URL: https://issues.apache.org/jira/browse/HDFS-9098 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Zhe Zhang > > Apparently the interleaving of events among {{StripedDataStreamer}}s is very > tricky to handle. [~walter.k.su] and [~jingzhao] have discussed several race > conditions under HDFS-9040. > Let's use FaultInjector to emulate different combinations of interleaved > events. > In particular, we should consider injecting delays in the following places: > # {{Streamer#endBlock}} > # {{Streamer#locateFollowingBlock}} > # {{Streamer#updateBlockForPipeline}} > # {{Streamer#updatePipeline}} > # {{OutputStream#writeChunk}} > # {{OutputStream#close}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
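A FaultInjector of the kind proposed above is typically a set of no-op hooks in production code that a test subclass overrides to sleep or record at the listed points. A minimal sketch (hook names and functions are illustrative, not the streamer's actual code):

```python
# Sketch of the FaultInjector pattern: production code fires named hooks
# (endBlock, updatePipeline, ...); in production the hook is a no-op, and
# a test installs a subclass that delays or records calls to force a
# chosen interleaving of events.
class FaultInjector:
    instance = None                  # tests install an injector here

    @classmethod
    def fire(cls, point):
        if cls.instance is not None:
            cls.instance.on(point)

    def on(self, point):             # overridden by test injectors
        pass

def update_pipeline():
    FaultInjector.fire("updatePipeline")   # hook before the real work
    return "pipeline-updated"

class RecordingInjector(FaultInjector):
    def __init__(self):
        self.points = []

    def on(self, point):
        self.points.append(point)    # a real test might sleep here instead
```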
[jira] [Updated] (HDFS-9088) Cleanup erasure coding documentation
[ https://issues.apache.org/jira/browse/HDFS-9088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-9088: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: HDFS-7285 Status: Resolved (was: Patch Available) I just committed to the feature branch. Thanks Andrew for the work! > Cleanup erasure coding documentation > > > Key: HDFS-9088 > URL: https://issues.apache.org/jira/browse/HDFS-9088 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: documentation >Affects Versions: HDFS-7285 >Reporter: Andrew Wang >Assignee: Andrew Wang > Fix For: HDFS-7285 > > Attachments: hdfs-9088.001.patch, hdfs-9088.002.patch > > > The documentation could use a pass to clean up typos, unify formatting, and > also make it more user-oriented. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path
[ https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-5802: Attachment: HDFS-5802.002.patch Thanks Yongjun for the review! It makes sense, and I've addressed comment 1-4. (The code was 80 chars, so no action for 5) Submit incremental patch 002 with my changes. > NameNode does not check for inode type before traversing down a path > > > Key: HDFS-5802 > URL: https://issues.apache.org/jira/browse/HDFS-5802 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.0-alpha >Reporter: Harsh J >Assignee: Xiao Chen >Priority: Trivial > Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch > > > This came up during the discussion on a forum at > http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162 > surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is > a file and not a directory. > In such a case, NameNode yields a user-confusing message of {{Permission > denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead > of clearly saying (and realising) "/foo is not a directory" or "/foo is a > file" before it tries to traverse further down to locate the requested path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path
[ https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-5802: Status: Open (was: Patch Available) > NameNode does not check for inode type before traversing down a path > > > Key: HDFS-5802 > URL: https://issues.apache.org/jira/browse/HDFS-5802 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.0-alpha >Reporter: Harsh J >Assignee: Xiao Chen >Priority: Trivial > Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch > > > This came up during the discussion on a forum at > http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162 > surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is > a file and not a directory. > In such a case, NameNode yields a user-confusing message of {{Permission > denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead > of clearly saying (and realising) "/foo is not a directory" or "/foo is a > file" before it tries to traverse further down to locate the requested path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path
[ https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-5802: Attachment: HDFS-5802.002.patch > NameNode does not check for inode type before traversing down a path > > > Key: HDFS-5802 > URL: https://issues.apache.org/jira/browse/HDFS-5802 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.0-alpha >Reporter: Harsh J >Assignee: Xiao Chen >Priority: Trivial > Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch > > > This came up during the discussion on a forum at > http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162 > surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is > a file and not a directory. > In such a case, NameNode yields a user-confusing message of {{Permission > denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead > of clearly saying (and realising) "/foo is not a directory" or "/foo is a > file" before it tries to traverse further down to locate the requested path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9004) Add upgrade domain to DatanodeInfo
[ https://issues.apache.org/jira/browse/HDFS-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803271#comment-14803271 ] Hadoop QA commented on HDFS-9004: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 19m 37s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 8m 0s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 16s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 35s | The applied patch generated 5 new checkstyle issues (total was 124, now 127). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 40s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 29s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 36s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 129m 0s | Tests failed in hadoop-hdfs. | | {color:green}+1{color} | hdfs tests | 0m 32s | Tests passed in hadoop-hdfs-client. 
| | | | 180m 45s | | \\ \\ || Reason || Tests || | Timed out tests | org.apache.hadoop.hdfs.server.namenode.TestStorageRestore | | | org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot | | | org.apache.hadoop.hdfs.server.namenode.TestNameEditsConfigs | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12756094/HDFS-9004-2.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 6c6e734 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12507/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12507/artifact/patchprocess/testrun_hadoop-hdfs.txt | | hadoop-hdfs-client test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12507/artifact/patchprocess/testrun_hadoop-hdfs-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12507/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12507/console | This message was automatically generated. > Add upgrade domain to DatanodeInfo > -- > > Key: HDFS-9004 > URL: https://issues.apache.org/jira/browse/HDFS-9004 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Attachments: HDFS-9004-2.patch, HDFS-9004.patch > > > As part of upgrade domain feature, we first need to add upgrade domain string > to {{DatanodeInfo}}. It includes things like: > * Add a new field to DatanodeInfo. > * Modify protobuf for DatanodeInfo. > * Update DatanodeInfo.getDatanodeReport to include upgrade domain. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
[ https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804288#comment-14804288 ] Hadoop QA commented on HDFS-8808: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 19m 9s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 8m 10s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 11m 15s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 29s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 31s | The applied patch generated 1 new checkstyle issues (total was 546, now 546). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 55s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 43s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 44s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 12s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 0m 19s | Tests failed in hadoop-hdfs. 
| | | | 49m 32s | | \\ \\ || Reason || Tests || | Failed build | hadoop-hdfs | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12751151/HDFS-8808.04.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 58d1a02 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12514/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12514/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12514/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12514/console | This message was automatically generated. > dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby > > > Key: HDFS-8808 > URL: https://issues.apache.org/jira/browse/HDFS-8808 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Gautam Gopalakrishnan >Assignee: Zhe Zhang > Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch, > HDFS-8808-02.patch, HDFS-8808-03.patch, HDFS-8808.04.patch > > > The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the > speed with which the fsimage is copied between the namenodes during regular > use. However, as a side effect, this also limits transfers when the > {{-bootstrapStandby}} option is used. This option is often used during > upgrades and could potentially slow down the entire workflow. The request > here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth > setting -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path
[ https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-5802: Status: Patch Available (was: Open) > NameNode does not check for inode type before traversing down a path > > > Key: HDFS-5802 > URL: https://issues.apache.org/jira/browse/HDFS-5802 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.0-alpha >Reporter: Harsh J >Assignee: Xiao Chen >Priority: Trivial > Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch > > > This came up during the discussion on a forum at > http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162 > surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is > a file and not a directory. > In such a case, NameNode yields a user-confusing message of {{Permission > denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead > of clearly saying (and realising) "/foo is not a directory" or "/foo is a > file" before it tries to traverse further down to locate the requested path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3107) HDFS truncate
[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803234#comment-14803234 ] Constantine Peresypkin commented on HDFS-3107: -- Hmm, fast patching nfs did not work. Intermittently fails on "lease already acquired". Seems like nfs gateway holds leases to all files it opened in some sort of cache. Very strange. > HDFS truncate > - > > Key: HDFS-3107 > URL: https://issues.apache.org/jira/browse/HDFS-3107 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Reporter: Lei Chang >Assignee: Plamen Jeliazkov > Fix For: 2.7.0 > > Attachments: HDFS-3107-13.patch, HDFS-3107-14.patch, > HDFS-3107-15.patch, HDFS-3107-HDFS-7056-combined.patch, HDFS-3107.008.patch, > HDFS-3107.15_branch2.patch, HDFS-3107.patch, HDFS-3107.patch, > HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, > HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, > HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, > HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, > HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf, > HDFS_truncate_semantics_Mar21.pdf, editsStored, editsStored.xml > > Original Estimate: 1,344h > Remaining Estimate: 1,344h > > Systems with transaction support often need to undo changes made to the > underlying storage when a transaction is aborted. Currently HDFS does not > support truncate (a standard Posix operation) which is a reverse operation of > append, which makes upper layer applications use ugly workarounds (such as > keeping track of the discarded byte range per file in a separate metadata > store, and periodically running a vacuum process to rewrite compacted files) > to overcome this limitation of HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9085) Show renewer information in DelegationTokenIdentifier#toString
[ https://issues.apache.org/jira/browse/HDFS-9085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803307#comment-14803307 ] zhihai xu commented on HDFS-9085: - Thanks for the review [~cnauroth]! That is great information. Yes, it makes sense to commit the patch to trunk only. > Show renewer information in DelegationTokenIdentifier#toString > -- > > Key: HDFS-9085 > URL: https://issues.apache.org/jira/browse/HDFS-9085 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Trivial > Attachments: HDFS-9085.001.patch > > > Show renewer information in {{DelegationTokenIdentifier#toString}}. Currently > {{org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier}} > didn't show the renewer information. It will be very useful to have renewer > information to debug security related issue. Because the renewer will be > filtered by "hadoop.security.auth_to_local", it will be helpful to show the > real renewer info after applying the rules. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9022) Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803300#comment-14803300 ] Mingliang Liu commented on HDFS-9022: - Thank you [~wheat9] for reviewing the code. I filed jira [MAPREDUCE-6483] for changes in {{hadoop-mapreduce}} module. > Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client > -- > > Key: HDFS-9022 > URL: https://issues.apache.org/jira/browse/HDFS-9022 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client, namenode >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9022.000.patch, HDFS-9022.001.patch, > HDFS-9022.002.patch, HDFS-9022.003.patch, HDFS-9022.004.patch > > > The static helper methods in NameNodes are used in {{hdfs-client}} module. > For example, it's used by the {{DFSClient}} and {{NameNodeProxies}} classes > which are being moved to {{hadoop-hdfs-client}} module. Meanwhile, we should > keep the {{NameNode}} class itself in the {{hadoop-hdfs}} module. > This jira tracks the effort of moving the following static helper methods out > of {{NameNode}} and thus {{hadoop-hdfs}} module. A good place to put these > methods is the {{DFSUtilClient}} class: > {code} > public static InetSocketAddress getAddress(String address); > public static InetSocketAddress getAddress(Configuration conf); > public static InetSocketAddress getAddress(URI filesystemURI); > public static URI getUri(InetSocketAddress namenode); > {code} > Be cautious not to bring new checkstyle warnings. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
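The helpers listed in the HDFS-9022 description are small address/URI conversions. The following is a simplified, self-contained sketch of what they do (hypothetical code; the real methods also read addresses from `Configuration` and handle more edge cases):

```java
import java.net.InetSocketAddress;
import java.net.URI;

// Simplified sketch of the static helpers being moved out of NameNode
// into DFSUtilClient. The parsing and default-port handling here are
// illustrative only.
class NameNodeAddressHelpers {
    static final int DEFAULT_PORT = 8020; // default NameNode RPC port

    /** Parse "host" or "host:port" into a socket address. */
    static InetSocketAddress getAddress(String address) {
        int colon = address.indexOf(':');
        if (colon < 0) {
            return InetSocketAddress.createUnresolved(address, DEFAULT_PORT);
        }
        return InetSocketAddress.createUnresolved(
            address.substring(0, colon),
            Integer.parseInt(address.substring(colon + 1)));
    }

    /** Build an hdfs:// URI from a NameNode address, omitting the default port. */
    static URI getUri(InetSocketAddress namenode) {
        int port = namenode.getPort();
        String portSuffix = (port == DEFAULT_PORT) ? "" : ":" + port;
        return URI.create("hdfs://" + namenode.getHostString() + portSuffix);
    }
}
```

Since these helpers have no dependency on NameNode state, moving them to the client module is purely a packaging change for callers like `DFSClient`.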
[jira] [Commented] (HDFS-8873) throttle directoryScanner
[ https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804279#comment-14804279 ] Daniel Templeton commented on HDFS-8873: [sigh] Those tests pass for me locally, so I can't say why they failed. The whitespace error is interesting. I changed line n in the patch. Jenkins complained about the whitespace on line n+1. I fixed the whitespace on line n+1 in the next patch. Jenkins is now complaining about the whitespace on line n+2. There is no issue on line n+3, so I could correct n+2 and be done, but at that point I've made whitespace changes on two lines that I didn't otherwise touch. What's the accepted way to do it? Fix the whitespace or ignore the error? > throttle directoryScanner > - > > Key: HDFS-8873 > URL: https://issues.apache.org/jira/browse/HDFS-8873 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.1 >Reporter: Nathan Roberts >Assignee: Daniel Templeton > Attachments: HDFS-8873.001.patch, HDFS-8873.002.patch, > HDFS-8873.003.patch > > > The new 2-level directory layout can make directory scans expensive in terms > of disk seeks (see HDFS-8791 for details). > It would be good if the directoryScanner() had a configurable duty cycle that > would reduce its impact on disk performance (much like the approach in > HDFS-8617). > Without such a throttle, disks can go 100% busy for many minutes at a time > (assuming the common case of all inodes in cache but no directory blocks > cached, 64K seeks are required for full directory listing which translates to > 655 seconds) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
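The "configurable duty cycle" proposed in HDFS-8873 can be sketched very simply: after each burst of scanning work, idle long enough that the scanner is busy only a configured fraction of wall time. This is a hypothetical standalone illustration, not the actual patch:

```java
// Sketch of a duty-cycle throttle for a periodic scanner. After running
// for runMs, idle so that run / (run + idle) == dutyCycle. Hypothetical
// code illustrating the idea, NOT the HDFS-8873 implementation.
class DutyCycleThrottle {
    private final double dutyCycle; // fraction of wall time the scanner may be busy, in (0, 1]

    DutyCycleThrottle(double dutyCycle) {
        if (dutyCycle <= 0 || dutyCycle > 1) {
            throw new IllegalArgumentException("duty cycle must be in (0, 1]");
        }
        this.dutyCycle = dutyCycle;
    }

    /** Idle time (ms) needed after runMs of work to hit the target duty cycle. */
    long idleMs(long runMs) {
        return Math.round(runMs * (1 - dutyCycle) / dutyCycle);
    }
}
```

With a duty cycle of 1.0 the scanner never idles (today's behavior); at 0.25 it sleeps three times as long as it works, which keeps the disks from going 100% busy for minutes at a stretch.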
[jira] [Commented] (HDFS-9004) Add upgrade domain to DatanodeInfo
[ https://issues.apache.org/jira/browse/HDFS-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804388#comment-14804388 ] Lei (Eddy) Xu commented on HDFS-9004: - Hi, [~mingma] Your patch looks very good in general. One small question: is the following change necessary in this patch? {code:title=DFSTestUtil.java} 1l, 2l, 3l, 4l, 0l, 0l, 0l, 5, 6, "local", adminState, + ipAddr + ":" + DFSConfigKeys.DFS_DATANODE_DEFAULT_PORT); {code} Will +1 once you address the above comment and verify that the test failures are not relevant. > Add upgrade domain to DatanodeInfo > -- > > Key: HDFS-9004 > URL: https://issues.apache.org/jira/browse/HDFS-9004 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Attachments: HDFS-9004-2.patch, HDFS-9004.patch > > > As part of upgrade domain feature, we first need to add upgrade domain string > to {{DatanodeInfo}}. It includes things like: > * Add a new field to DatanodeInfo. > * Modify protobuf for DatanodeInfo. > * Update DatanodeInfo.getDatanodeReport to include upgrade domain. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7955) Improve naming of classes, methods, and variables related to block replication and recovery
[ https://issues.apache.org/jira/browse/HDFS-7955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804298#comment-14804298 ] Andrew Wang commented on HDFS-7955: --- I noticed with HDFS-8899 the datanode config keys need some renaming: {noformat} public static final String DFS_DATANODE_STRIPED_READ_THREADS_KEY = "dfs.datanode.stripedread.threads"; public static final int DFS_DATANODE_STRIPED_READ_THREADS_DEFAULT = 20; public static final String DFS_DATANODE_STRIPED_READ_BUFFER_SIZE_KEY = "dfs.datanode.stripedread.buffer.size"; public static final int DFS_DATANODE_STRIPED_READ_BUFFER_SIZE_DEFAULT = 64 * 1024; public static final String DFS_DATANODE_STRIPED_READ_TIMEOUT_MILLIS_KEY = "dfs.datanode.stripedread.timeout.millis"; public static final int DFS_DATANODE_STRIPED_READ_TIMEOUT_MILLIS_DEFAULT = 5000; //5s public static final String DFS_DATANODE_STRIPED_BLK_RECOVERY_THREADS_KEY = "dfs.datanode.striped.blockrecovery.threads.size"; public static final int DFS_DATANODE_STRIPED_BLK_RECOVERY_THREADS_DEFAULT = 8; {noformat} The term "block recovery" is overloaded here, I'd recommend "reconstruction" instead. All of these config keys are also for ECWorker and related, so should also have the same prefix, e.g. "dfs.datanode.ec.reconstruction" or something. IIUC there's a "read" thread pool and a "compute" thread pool; that distinction hopefully is also made apparent in the key naming and descriptions. > Improve naming of classes, methods, and variables related to block > replication and recovery > --- > > Key: HDFS-7955 > URL: https://issues.apache.org/jira/browse/HDFS-7955 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Zhe Zhang >Assignee: Rakesh R > Attachments: HDFS-7955-001.patch > > > Many existing names should be revised to avoid confusion when blocks can be > both replicated and erasure coded. This JIRA aims to solicit opinions on > making those names more consistent and intuitive. 
> # In current HDFS _block recovery_ refers to the process of finalizing the > last block of a file, triggered by _lease recovery_. It is different from the > intuitive meaning of _recovering a lost block_. To avoid confusion, I can > think of 2 options: > #* Rename this process as _block finalization_ or _block completion_. I > prefer this option because this is literally not a recovery. > #* If we want to keep existing terms unchanged we can name all EC recovery > and re-replication logics as _reconstruction_. > # As Kai [suggested | > https://issues.apache.org/jira/browse/HDFS-7369?focusedCommentId=14361131=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14361131] > under HDFS-7369, several replication-based names should be made more generic: > #* {{UnderReplicatedBlocks}} and {{neededReplications}}. E.g. we can use > {{LowRedundancyBlocks}}/{{AtRiskBlocks}}, and > {{neededRecovery}}/{{neededReconstruction}}. > #* {{PendingReplicationBlocks}} > #* {{ReplicationMonitor}} > I'm sure the above list is incomplete; discussions and comments are very > welcome. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
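The rename suggested in the comment above might look like the following. The `dfs.datanode.ec.reconstruction.*` key names here are hypothetical, purely to illustrate the proposed common prefix, the "reconstruction" term in place of the overloaded "recovery", and the read vs. compute split:

```java
// Hypothetical renamed ECWorker config keys following the suggestion above.
// These names are illustrative only and are not the keys Hadoop shipped.
class EcReconstructionConfigKeys {
    static final String PREFIX = "dfs.datanode.ec.reconstruction.";

    // Keys for the striped-read thread pool.
    static final String READ_THREADS_KEY = PREFIX + "stripedread.threads";
    static final int READ_THREADS_DEFAULT = 20;
    static final String READ_BUFFER_SIZE_KEY = PREFIX + "stripedread.buffer.size";
    static final int READ_BUFFER_SIZE_DEFAULT = 64 * 1024;
    static final String READ_TIMEOUT_MILLIS_KEY = PREFIX + "stripedread.timeout.millis";
    static final int READ_TIMEOUT_MILLIS_DEFAULT = 5000; // 5s

    // Key for the reconstruction (compute) thread pool -- "reconstruction"
    // rather than "blockrecovery", which collides with lease recovery.
    static final String THREADS_KEY = PREFIX + "threads";
    static final int THREADS_DEFAULT = 8;
}
```

Grouping all four keys under one prefix makes the read pool / compute pool distinction visible in the names themselves, as the comment asks.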
[jira] [Commented] (HDFS-9047) deprecate libwebhdfs in branch-2; remove from trunk
[ https://issues.apache.org/jira/browse/HDFS-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804319#comment-14804319 ] Colin Patrick McCabe commented on HDFS-9047: Like I said, I don't have any objection to replacing libwebhdfs with some code that's better and does the same job. I just don't think we should remove it with no replacement. > deprecate libwebhdfs in branch-2; remove from trunk > --- > > Key: HDFS-9047 > URL: https://issues.apache.org/jira/browse/HDFS-9047 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: Allen Wittenauer > > This library is basically a mess: > * It's not part of the mvn package > * It's missing functionality and barely maintained > * It's not in the precommit runs so doesn't get exercised regularly > * It's not part of the unit tests (at least, that I can see) > * It isn't documented in any official documentation > But most importantly: > * It fails at its primary mission of being pure C (HDFS-3917 is STILL open) > Let's cut our losses and just remove it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9022) Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804459#comment-14804459 ] Haohui Mai commented on HDFS-9022: -- +1 > Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client > -- > > Key: HDFS-9022 > URL: https://issues.apache.org/jira/browse/HDFS-9022 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client, namenode >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Attachments: HDFS-9022.000.patch, HDFS-9022.001.patch, > HDFS-9022.002.patch, HDFS-9022.003.patch, HDFS-9022.004.patch > > > The static helper methods in NameNodes are used in {{hdfs-client}} module. > For example, it's used by the {{DFSClient}} and {{NameNodeProxies}} classes > which are being moved to {{hadoop-hdfs-client}} module. Meanwhile, we should > keep the {{NameNode}} class itself in the {{hadoop-hdfs}} module. > This jira tracks the effort of moving the following static helper methods out > of {{NameNode}} and thus {{hadoop-hdfs}} module. A good place to put these > methods is the {{DFSUtilClient}} class: > {code} > public static InetSocketAddress getAddress(String address); > public static InetSocketAddress getAddress(Configuration conf); > public static InetSocketAddress getAddress(URI filesystemURI); > public static URI getUri(InetSocketAddress namenode); > {code} > Be cautious not to bring new checkstyle warnings. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9063) Correctly handle snapshot path for getContentSummary
[ https://issues.apache.org/jira/browse/HDFS-9063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804482#comment-14804482 ] Hadoop QA commented on HDFS-9063: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 4s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 0s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 15s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 24s | The applied patch generated 2 new checkstyle issues (total was 177, now 179). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 28s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 31s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 10s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 164m 2s | Tests failed in hadoop-hdfs. 
| | | | 209m 56s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes | | | hadoop.hdfs.server.namenode.TestNameNodeResourceChecker | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12756374/HDFS-9063.000.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 58d1a02 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12511/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12511/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12511/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12511/console | This message was automatically generated. > Correctly handle snapshot path for getContentSummary > > > Key: HDFS-9063 > URL: https://issues.apache.org/jira/browse/HDFS-9063 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Jing Zhao >Assignee: Jing Zhao > Attachments: HDFS-9063.000.patch > > > The current getContentSummary implementation does not take into account the > snapshot path, thus if we have the following ops: > 1. create dirs /foo/bar > 2. take snapshot s1 on /foo > 3. create a 1 byte file /foo/bar/baz > then "du /foo" and "du /foo/.snapshot/s1" can report same results for "bar", > which is incorrect since the 1 byte file is not included in snapshot s1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8967) Create a BlockManagerLock class to represent the lock used in the BlockManager
[ https://issues.apache.org/jira/browse/HDFS-8967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804485#comment-14804485 ] Hadoop QA commented on HDFS-8967: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12753563/HDFS-8967.002.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 3f82f58 | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12516/console | This message was automatically generated. > Create a BlockManagerLock class to represent the lock used in the BlockManager > -- > > Key: HDFS-8967 > URL: https://issues.apache.org/jira/browse/HDFS-8967 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Haohui Mai >Assignee: Haohui Mai > Attachments: HDFS-8967.000.patch, HDFS-8967.001.patch, > HDFS-8967.002.patch > > > This jira proposes to create a {{BlockManagerLock}} class to represent the > lock used in {{BlockManager}}. > Currently it directly points to the {{FSNamesystem}} lock thus there are no > functionality changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9086) Rename dfs.datanode.stripedread.threshold.millis to dfs.datanode.stripedread.timeout.millis
[ https://issues.apache.org/jira/browse/HDFS-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-9086: -- Attachment: hdfs-9086-hdfs-7285.001.patch Patch attached doing this rename > Rename dfs.datanode.stripedread.threshold.millis to > dfs.datanode.stripedread.timeout.millis > --- > > Key: HDFS-9086 > URL: https://issues.apache.org/jira/browse/HDFS-9086 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: HDFS-7285 >Reporter: Andrew Wang >Assignee: Andrew Wang >Priority: Trivial > Attachments: hdfs-9086-hdfs-7285.001.patch > > > This config key is used to control the timeout for ECWorker reads, let's name > it with the standard term "timeout" rather than "threshold". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path
[ https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-5802: Status: Open (was: Patch Available) > NameNode does not check for inode type before traversing down a path > > > Key: HDFS-5802 > URL: https://issues.apache.org/jira/browse/HDFS-5802 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.0-alpha >Reporter: Harsh J >Assignee: Xiao Chen >Priority: Trivial > Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch, > HDFS-5802.003.patch > > > This came up during the discussion on a forum at > http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162 > surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is > a file and not a directory. > In such a case, NameNode yields a user-confusing message of {{Permission > denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead > of clearly saying (and realising) "/foo is not a directory" or "/foo is a > file" before it tries to traverse further down to locate the requested path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9097) Erasure coding: update EC command "-s" flag to "-p" when specifying policy
[ https://issues.apache.org/jira/browse/HDFS-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-9097: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: HDFS-7285 Status: Resolved (was: Patch Available) Thanks Andrew for reviewing! The test failures are unrelated. {{TestWebHDFSOAuth2}} pass locally (seems to be a library loading issue). {{testWriteStripedFileWithDNFailure}} is flaky in the branch nightly Jenkins and we should fix it (as a new subtask). The findbug issues are pre-existing as well, being addressed in HDFS-8550. > Erasure coding: update EC command "-s" flag to "-p" when specifying policy > -- > > Key: HDFS-9097 > URL: https://issues.apache.org/jira/browse/HDFS-9097 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7285 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Fix For: HDFS-7285 > > Attachments: HDFS-9097-HDFS-7285.00.patch > > > HDFS-8833 missed this update. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7037) Using distcp to copy data from insecure to secure cluster via hftp doesn't work (branch-2 only)
[ https://issues.apache.org/jira/browse/HDFS-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804594#comment-14804594 ] Haohui Mai commented on HDFS-7037: -- bq. adding this capability to HFTP does not change the security semantics of Hadoop at all, since RPC and other interfaces used for remote access already support allowing configurable insecure fallback Please correct me if I misunderstood. (1) The current behavior of RPC / WebHDFS is less than ideal and it is vulnerable to attack. (2) You argue that the proposed change makes HFTP vulnerable via the fallback, but it is no worse than what we have in RPC / WebHDFS today. As an analogy, it seems to me that the argument is that it's okay to have a broken window given that we have many broken windows already? My question is: is there a need to create yet another workaround, given that we know that it is prone to security vulnerabilities? I'd like to understand your use cases better. Can you please elaborate why you'll need another workaround in HFTP, given that you guys have put the workaround in WebHDFS already? > Using distcp to copy data from insecure to secure cluster via hftp doesn't > work (branch-2 only) > > > Key: HDFS-7037 > URL: https://issues.apache.org/jira/browse/HDFS-7037 > Project: Hadoop HDFS > Issue Type: Bug > Components: security, tools >Affects Versions: 2.6.0 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Labels: BB2015-05-TBR > Attachments: HDFS-7037.001.patch > > > This is a branch-2 only issue since hftp is only supported there. 
> Issuing "distcp hftp:// hdfs://" gave the > following failure exception: > {code} > 14/09/13 22:07:40 INFO tools.DelegationTokenFetcher: Error when dealing > remote token: > java.io.IOException: Error when dealing remote token: Internal Server Error > at > org.apache.hadoop.hdfs.tools.DelegationTokenFetcher.run(DelegationTokenFetcher.java:375) > at > org.apache.hadoop.hdfs.tools.DelegationTokenFetcher.getDTfromRemote(DelegationTokenFetcher.java:238) > at > org.apache.hadoop.hdfs.web.HftpFileSystem$2.run(HftpFileSystem.java:252) > at > org.apache.hadoop.hdfs.web.HftpFileSystem$2.run(HftpFileSystem.java:247) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) > at > org.apache.hadoop.hdfs.web.HftpFileSystem.getDelegationToken(HftpFileSystem.java:247) > at > org.apache.hadoop.hdfs.web.TokenAspect.ensureTokenInitialized(TokenAspect.java:140) > at > org.apache.hadoop.hdfs.web.HftpFileSystem.addDelegationTokenParam(HftpFileSystem.java:337) > at > org.apache.hadoop.hdfs.web.HftpFileSystem.openConnection(HftpFileSystem.java:324) > at > org.apache.hadoop.hdfs.web.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:457) > at > org.apache.hadoop.hdfs.web.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:472) > at > org.apache.hadoop.hdfs.web.HftpFileSystem.getFileStatus(HftpFileSystem.java:501) > at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:57) > at org.apache.hadoop.fs.Globber.glob(Globber.java:248) > at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1623) > at > org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:77) > at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:81) > at > org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:342) > at org.apache.hadoop.tools.DistCp.execute(DistCp.java:154) > at 
org.apache.hadoop.tools.DistCp.run(DistCp.java:121) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.tools.DistCp.main(DistCp.java:390) > 14/09/13 22:07:40 WARN security.UserGroupInformation: > PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) > cause:java.io.IOException: Unable to obtain remote token > 14/09/13 22:07:40 ERROR tools.DistCp: Exception encountered > java.io.IOException: Unable to obtain remote token > at > org.apache.hadoop.hdfs.tools.DelegationTokenFetcher.getDTfromRemote(DelegationTokenFetcher.java:249) > at > org.apache.hadoop.hdfs.web.HftpFileSystem$2.run(HftpFileSystem.java:252) > at > org.apache.hadoop.hdfs.web.HftpFileSystem$2.run(HftpFileSystem.java:247) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) > at >
[jira] [Commented] (HDFS-9086) Rename dfs.datanode.stripedread.threshold.millis to dfs.datanode.stripedread.timeout.millis
[ https://issues.apache.org/jira/browse/HDFS-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804604#comment-14804604 ] Hadoop QA commented on HDFS-9086: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12757162/hdfs-9086-hdfs-7285.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle site | | git revision | trunk / ee4ee6a | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12520/console | This message was automatically generated. > Rename dfs.datanode.stripedread.threshold.millis to > dfs.datanode.stripedread.timeout.millis > --- > > Key: HDFS-9086 > URL: https://issues.apache.org/jira/browse/HDFS-9086 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: HDFS-7285 >Reporter: Andrew Wang >Assignee: Andrew Wang >Priority: Trivial > Attachments: hdfs-9086-hdfs-7285.001.patch > > > This config key is used to control the timeout for ECWorker reads, let's name > it with the standard term "timeout" rather than "threshold". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path
[ https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-5802: Attachment: HDFS-5802.004.patch Upload patch 004 to fix checkstyle warnings. The whitespace error is not from my changes, leave it for now. > NameNode does not check for inode type before traversing down a path > > > Key: HDFS-5802 > URL: https://issues.apache.org/jira/browse/HDFS-5802 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.0-alpha >Reporter: Harsh J >Assignee: Xiao Chen >Priority: Trivial > Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch, > HDFS-5802.003.patch, HDFS-5802.004.patch > > > This came up during the discussion on a forum at > http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162 > surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is > a file and not a directory. > In such a case, NameNode yields a user-confusing message of {{Permission > denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead > of clearly saying (and realising) "/foo is not a directory" or "/foo is a > file" before it tries to traverse further down to locate the requested path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7037) Using distcp to copy data from insecure to secure cluster via hftp doesn't work (branch-2 only)
[ https://issues.apache.org/jira/browse/HDFS-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804611#comment-14804611 ] Aaron T. Myers commented on HDFS-7037: -- bq. Please correct me if I misunderstood. (1) The current behavior of RPC / WebHDFS is less than ideal and it is vulnerable to attack. (2) You argue that the proposed changes makes HFTP vulnerable for the fallback, but it is no worse than what we have in RPC / WebHDFS today. Correct. bq. As an analogy, it seems to me that the argument is that it's okay to have a broken window given that we have many broken windows already? I don't think that's a reasonable analogy. The point you were making is that this change introduces a possible security vulnerability. I'm saying that this is demonstrably not a security vulnerability, since we consciously chose to add this capability to other interfaces. HADOOP-11701 will make things configurably more secure for all interfaces, but that's a separate discussion. bq. My question is that is there a need to create yet another workaround, given that we know that it is prone for security vulnerability? Like I said above, this should not be considered a security vulnerability. If it is, then we should have never added this capability to WebHDFS/RPC, and we should be reverting it from WebHDFS/RPC right now. bq. I'd like to understand your use cases better? Can you please elaborate why you'll need another workaround in HFTP, given that you guys have put the workaround in WebHDFS already? Simple: because some users use HFTP and not WebHDFS, specifically for distcp from older clusters. 
> Using distcp to copy data from insecure to secure cluster via hftp doesn't > work (branch-2 only) > > > Key: HDFS-7037 > URL: https://issues.apache.org/jira/browse/HDFS-7037 > Project: Hadoop HDFS > Issue Type: Bug > Components: security, tools >Affects Versions: 2.6.0 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Labels: BB2015-05-TBR > Attachments: HDFS-7037.001.patch > > > This is a branch-2 only issue since hftp is only supported there. > Issuing "distcp hftp:// hdfs://" gave the > following failure exception: > {code} > 14/09/13 22:07:40 INFO tools.DelegationTokenFetcher: Error when dealing > remote token: > java.io.IOException: Error when dealing remote token: Internal Server Error > at > org.apache.hadoop.hdfs.tools.DelegationTokenFetcher.run(DelegationTokenFetcher.java:375) > at > org.apache.hadoop.hdfs.tools.DelegationTokenFetcher.getDTfromRemote(DelegationTokenFetcher.java:238) > at > org.apache.hadoop.hdfs.web.HftpFileSystem$2.run(HftpFileSystem.java:252) > at > org.apache.hadoop.hdfs.web.HftpFileSystem$2.run(HftpFileSystem.java:247) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) > at > org.apache.hadoop.hdfs.web.HftpFileSystem.getDelegationToken(HftpFileSystem.java:247) > at > org.apache.hadoop.hdfs.web.TokenAspect.ensureTokenInitialized(TokenAspect.java:140) > at > org.apache.hadoop.hdfs.web.HftpFileSystem.addDelegationTokenParam(HftpFileSystem.java:337) > at > org.apache.hadoop.hdfs.web.HftpFileSystem.openConnection(HftpFileSystem.java:324) > at > org.apache.hadoop.hdfs.web.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:457) > at > org.apache.hadoop.hdfs.web.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:472) > at > org.apache.hadoop.hdfs.web.HftpFileSystem.getFileStatus(HftpFileSystem.java:501) > at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:57) > at 
org.apache.hadoop.fs.Globber.glob(Globber.java:248) > at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1623) > at > org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:77) > at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:81) > at > org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:342) > at org.apache.hadoop.tools.DistCp.execute(DistCp.java:154) > at org.apache.hadoop.tools.DistCp.run(DistCp.java:121) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.tools.DistCp.main(DistCp.java:390) > 14/09/13 22:07:40 WARN security.UserGroupInformation: > PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) > cause:java.io.IOException: Unable to obtain remote token > 14/09/13 22:07:40 ERROR tools.DistCp: Exception encountered > java.io.IOException: Unable to obtain remote token > at >
[jira] [Updated] (HDFS-9086) Rename dfs.datanode.stripedread.threshold.millis to dfs.datanode.stripedread.timeout.millis
[ https://issues.apache.org/jira/browse/HDFS-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-9086: -- Attachment: HDFS-9086-HDFS-7285.001.patch Thanks for reviewing, Zhe. Attaching the same patch with a capitalized name; let's see if Jenkins takes it :) > Rename dfs.datanode.stripedread.threshold.millis to > dfs.datanode.stripedread.timeout.millis > --- > > Key: HDFS-9086 > URL: https://issues.apache.org/jira/browse/HDFS-9086 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: HDFS-7285 >Reporter: Andrew Wang >Assignee: Andrew Wang >Priority: Trivial > Attachments: HDFS-9086-HDFS-7285.001.patch, > hdfs-9086-hdfs-7285.001.patch > > > This config key is used to control the timeout for ECWorker reads, let's name > it with the standard term "timeout" rather than "threshold". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5802) NameNode does not check for inode type before traversing down a path
[ https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804495#comment-14804495 ] Hadoop QA commented on HDFS-5802: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 19s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 3s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 14s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 21s | The applied patch generated 3 new checkstyle issues (total was 36, now 39). | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 28s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 28s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 16s | Pre-build of native portion | | {color:green}+1{color} | hdfs tests | 163m 49s | Tests passed in hadoop-hdfs. 
| | | | 209m 58s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12757115/HDFS-5802.002.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 58d1a02 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12512/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12512/artifact/patchprocess/whitespace.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12512/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12512/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12512/console | This message was automatically generated. > NameNode does not check for inode type before traversing down a path > > > Key: HDFS-5802 > URL: https://issues.apache.org/jira/browse/HDFS-5802 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.0-alpha >Reporter: Harsh J >Assignee: Xiao Chen >Priority: Trivial > Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch, > HDFS-5802.003.patch > > > This came up during the discussion on a forum at > http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162 > surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is > a file and not a directory. > In such a case, NameNode yields a user-confusing message of {{Permission > denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead > of clearly saying (and realising) "/foo is not a directory" or "/foo is a > file" before it tries to traverse further down to locate the requested path. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-9099) Move DistributedFileSystem to hadoop-hdfs-client
Mingliang Liu created HDFS-9099: --- Summary: Move DistributedFileSystem to hadoop-hdfs-client Key: HDFS-9099 URL: https://issues.apache.org/jira/browse/HDFS-9099 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Mingliang Liu Assignee: Mingliang Liu This jira tracks the effort of moving the {{org.apache.hadoop.hdfs.DistributedFileSystem}} class from the {{hadoop-hdfs}} module to the {{hadoop-hdfs-client}} module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8873) throttle directoryScanner
[ https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804503#comment-14804503 ] Colin Patrick McCabe commented on HDFS-8873: The Jenkins errors look like: {code} java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.protocol.DatanodeInfo.<init>(Lorg/apache/hadoop/hdfs/protocol/DatanodeID;Ljava/lang/String;ILorg/apache/hadoop/hdfs/protocol/DatanodeInfo$AdminStates;)V at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:591) {code} We've seen this before and never managed to track it down. It seems to be a bug in our Jenkins integration, possibly related to having multiple maven invocations going on at once sharing the same .m2 directory. I will re-trigger the build. bq. The whitespace error is interesting. I changed line n in the patch. Jenkins complained about the whitespace on line n+1. I fixed the whitespace on line n+1 in the next patch. Jenkins is now complaining about the whitespace on line n+2 I would say just leave it alone. If you didn't introduce the whitespace issue then don't worry about it. We really should turn off most of those checkstyle checks since they provide no value. > throttle directoryScanner > - > > Key: HDFS-8873 > URL: https://issues.apache.org/jira/browse/HDFS-8873 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.1 >Reporter: Nathan Roberts >Assignee: Daniel Templeton > Attachments: HDFS-8873.001.patch, HDFS-8873.002.patch, > HDFS-8873.003.patch > > > The new 2-level directory layout can make directory scans expensive in terms > of disk seeks (see HDFS-8791 for details). > It would be good if the directoryScanner() had a configurable duty cycle that > would reduce its impact on disk performance (much like the approach in > HDFS-8617). 
> Without such a throttle, disks can go 100% busy for many minutes at a time > (assuming the common case of all inodes in cache but no directory blocks > cached, 64K seeks are required for full directory listing which translates to > 655 seconds) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
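The configurable duty cycle proposed above can be sketched as follows. Note this is only an illustration of the idea, with a hypothetical `DutyCycleThrottle` class and parameter names; it is not the code in any attached patch.

```java
// Hypothetical sketch of a duty-cycle throttle for a scanner thread.
// A duty cycle of 0.25 means the scanner may be busy at most 25% of
// wall-clock time; after each burst of work we sleep proportionally.
public final class DutyCycleThrottle {
    private final double dutyCycle;

    public DutyCycleThrottle(double dutyCycle) {
        if (dutyCycle <= 0.0 || dutyCycle > 1.0) {
            throw new IllegalArgumentException("duty cycle must be in (0, 1]");
        }
        this.dutyCycle = dutyCycle;
    }

    /** How long to sleep after a burst of work that took workMillis. */
    public long sleepMillisFor(long workMillis) {
        return (long) (workMillis * (1.0 - dutyCycle) / dutyCycle);
    }
}
```

For example, at a 25% duty cycle, a 100 ms burst of directory scanning would be followed by a 300 ms sleep, capping how long the disks stay 100% busy.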
[jira] [Updated] (HDFS-9080) update htrace version to 4.0
[ https://issues.apache.org/jira/browse/HDFS-9080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-9080: --- Attachment: HDFS-9080.004.patch > update htrace version to 4.0 > > > Key: HDFS-9080 > URL: https://issues.apache.org/jira/browse/HDFS-9080 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.8.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-9080.001.patch, HDFS-9080.002.patch, > HDFS-9080.003.patch, HDFS-9080.004.patch > > > Update the HTrace library version Hadoop uses to htrace 4.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8696) Reduce the variances of latency of WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804558#comment-14804558 ] Bob Hansen commented on HDFS-8696: -- Testing the latest patch as part of a full Hadoop build (rather than just a set of patched jars over an older Hadoop build) shows much less variance. After a warm-up period, we had >500k short requests < 1000ms and 0 at >= 1000ms. Let's call this a tentative success while we continue testing. I've reviewed the code. We can probably drop the default nio thread count from 100 down to at most the number of CPUs. Other than that, +1. > Reduce the variances of latency of WebHDFS > -- > > Key: HDFS-8696 > URL: https://issues.apache.org/jira/browse/HDFS-8696 > Project: Hadoop HDFS > Issue Type: Improvement > Components: webhdfs >Affects Versions: 2.7.0 >Reporter: Xiaobing Zhou >Assignee: Xiaobing Zhou > Attachments: HDFS-8696.1.patch, HDFS-8696.2.patch, HDFS-8696.3.patch > > > There is an issue that appears related to the webhdfs server. When making two > concurrent requests, the DN will sometimes pause for extended periods (I've > seen 1-300 seconds), killing performance and dropping connections. > To reproduce: > 1. set up an HDFS cluster > 2. Upload a large file (I was using 10GB). Perform 1-byte reads, writing > the time out to /tmp/times.txt > {noformat} > i=1 > while true; do > echo $i > let i++ > /usr/bin/time -f %e -o /tmp/times.txt -a curl -s -L -o /dev/null > "http://<namenode>:50070/webhdfs/v1/tmp/bigfile?op=OPEN&user.name=root&length=1"; > done > {noformat} > 3. Watch for 1-byte requests that take more than one second: > tail -F /tmp/times.txt | grep -E "^[^0]" > 4. 
After it has had a chance to warm up, start doing large transfers from > another shell: > {noformat} > i=1 > while true; do > echo $i > let i++ > /usr/bin/time -f %e curl -s -L -o /dev/null > "http://<namenode>:50070/webhdfs/v1/tmp/bigfile?op=OPEN&user.name=root"; > done > {noformat} > It's easy to find after a minute or two that small reads will sometimes > pause for 1-300 seconds. In some extreme cases, it appears that the > transfers time out and the DN drops the connection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8550) Erasure Coding: Fix FindBugs Multithreaded correctness Warning
[ https://issues.apache.org/jira/browse/HDFS-8550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804556#comment-14804556 ] Zhe Zhang commented on HDFS-8550: - Thanks Rakesh! The fixes look good except for the following questions / suggestions: {code} - if (b.isStriped()) { + if (b.isStriped() && b instanceof LocatedStripedBlock) { {code} Better be: {code} if (b.isStriped()) { Preconditions.checkState(b instanceof LocatedStripedBlock); } {code} {{int bufOffset = (int) (rangeStartInBlockGroup % ((long) cellSize * dataBlkNum));}}: should it be {{(long)(cellSize * dataBlkNum)}}? {{synchronized (DFSStripedInputStream.this)}} maybe {{synchronized (curStripeBuf)}} is more explicit? > Erasure Coding: Fix FindBugs Multithreaded correctness Warning > -- > > Key: HDFS-8550 > URL: https://issues.apache.org/jira/browse/HDFS-8550 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Rakesh R >Assignee: Rakesh R > Attachments: HDFS-8550-HDFS-7285-00.patch, > HDFS-8550-HDFS-7285-01.patch > > > Please find the findbug warnings > [report|https://builds.apache.org/job/PreCommit-HDFS-Build/12444/artifact/patchprocess/patchFindbugsWarningshadoop-hdfs.html] > 1) {code} > Bug type IS2_INCONSISTENT_SYNC (click for details) > In class org.apache.hadoop.hdfs.DFSStripedInputStream > Field org.apache.hadoop.hdfs.DFSStripedInputStream.curStripeBuf > Synchronized 90% of the time > Unsynchronized access at DFSStripedInputStream.java:[line 829] > Synchronized access at DFSStripedInputStream.java:[line 183] > Synchronized access at DFSStripedInputStream.java:[line 186] > Synchronized access at DFSStripedInputStream.java:[line 184] > Synchronized access at DFSStripedInputStream.java:[line 382] > Synchronized access at DFSStripedInputStream.java:[line 460] > Synchronized access at DFSStripedInputStream.java:[line 461] > Synchronized access at DFSStripedInputStream.java:[line 461] > Synchronized access at DFSStripedInputStream.java:[line 285] > Synchronized access at 
DFSStripedInputStream.java:[line 297] > Synchronized access at DFSStripedInputStream.java:[line 298] > {code} > 2) > {code} > Unread field: > org.apache.hadoop.hdfs.DFSStripedInputStream$BlockReaderInfo.targetBlock > Bug type URF_UNREAD_FIELD (click for details) > In class org.apache.hadoop.hdfs.DFSStripedInputStream$BlockReaderInfo > Field org.apache.hadoop.hdfs.DFSStripedInputStream$BlockReaderInfo.targetBlock > At DFSStripedInputStream.java:[line 126] > {code} > 3) > {code} > Unchecked/unconfirmed cast from org.apache.hadoop.hdfs.protocol.LocatedBlock > to org.apache.hadoop.hdfs.protocol.LocatedStripedBlock in > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.setBlockToken(LocatedBlock, > BlockTokenIdentifier$AccessMode) > Bug type BC_UNCONFIRMED_CAST (click for details) > In class org.apache.hadoop.hdfs.server.blockmanagement.BlockManager > In method > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.setBlockToken(LocatedBlock, > BlockTokenIdentifier$AccessMode) > Actual type org.apache.hadoop.hdfs.protocol.LocatedBlock > Expected org.apache.hadoop.hdfs.protocol.LocatedStripedBlock > Value loaded from b > At BlockManager.java:[line 974] > {code} > 4) > {code} > Result of integer multiplication cast to long in > org.apache.hadoop.hdfs.util.StripedBlockUtil.divideOneStripe(ErasureCodingPolicy, > int, LocatedStripedBlock, long, long, ByteBuffer) > Bug type ICAST_INTEGER_MULTIPLY_CAST_TO_LONG (click for details) > In class org.apache.hadoop.hdfs.util.StripedBlockUtil > In method > org.apache.hadoop.hdfs.util.StripedBlockUtil.divideOneStripe(ErasureCodingPolicy, > int, LocatedStripedBlock, long, long, ByteBuffer) > At StripedBlockUtil.java:[line 375] > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
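Warning 4 above (ICAST_INTEGER_MULTIPLY_CAST_TO_LONG) is worth spelling out, since it also bears on the review question about {{(long)(cellSize * dataBlkNum)}}: casting the *result* of an int multiplication leaves the multiply itself 32-bit, so it can overflow before the cast, whereas widening one operand first makes the whole multiply 64-bit. A small demo follows; the values are illustrative, not taken from the patch.

```java
public class IntMultiplyCastDemo {
    public static void main(String[] args) {
        int cellSize = 1 << 30;  // illustrative values, large enough to overflow an int
        int dataBlkNum = 4;

        // Buggy pattern FindBugs flags: the multiplication is done in 32-bit
        // int arithmetic and wraps around *before* the cast to long.
        long overflowed = (long) (cellSize * dataBlkNum);

        // Safe pattern: widening one operand first promotes the multiply to long.
        long widened = (long) cellSize * dataBlkNum;

        System.out.println(overflowed); // 0 (2^32 wrapped to 0 in int)
        System.out.println(widened);    // 4294967296
    }
}
```

In other words, if the factors can be large, the existing {{((long) cellSize * dataBlkNum)}} form is the overflow-safe one.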
[jira] [Commented] (HDFS-9004) Add upgrade domain to DatanodeInfo
[ https://issues.apache.org/jira/browse/HDFS-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804634#comment-14804634 ] Lei (Eddy) Xu commented on HDFS-9004: - +1 pending jenkins. > Add upgrade domain to DatanodeInfo > -- > > Key: HDFS-9004 > URL: https://issues.apache.org/jira/browse/HDFS-9004 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Attachments: HDFS-9004-2.patch, HDFS-9004-3.patch, HDFS-9004.patch > > > As part of upgrade domain feature, we first need to add upgrade domain string > to {{DatanodeInfo}}. It includes things like: > * Add a new field to DatanodeInfo. > * Modify protobuf for DatanodeInfo. > * Update DatanodeInfo.getDatanodeReport to include upgrade domain. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9022) Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804651#comment-14804651 ] Hudson commented on HDFS-9022: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2350 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2350/]) HDFS-9022. Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client. Contributed by Mingliang Liu. (wheat9: rev 9eee97508f350ed4629abb04e7781514ffa04070) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeFile.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestWrites.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeRollingUpgrade.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/mount/RpcProgramMountd.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPersistBlocks.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSZKFailoverController.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/DFSClientCache.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestGetBlocks.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/BootstrapStandby.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestFailureToReadEdits.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileStatus.java * 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSShellGenericOptions.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/NameNodeProxies.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDefaultNameNodePort.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/GetGroups.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ImageServlet.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSUtilClient.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/DfsServlet.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/IPFailoverProxyProvider.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientFailover.java > Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client > -- > > Key: HDFS-9022 > URL: https://issues.apache.org/jira/browse/HDFS-9022 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 2.8.0 > > Attachments: HDFS-9022.000.patch, HDFS-9022.001.patch, > HDFS-9022.002.patch, HDFS-9022.003.patch, HDFS-9022.004.patch > > > The static helper methods in NameNodes are used in {{hdfs-client}} module. > For example, it's used by the {{DFSClient}} and {{NameNodeProxies}} classes > which are being moved to {{hadoop-hdfs-client}} module. Meanwhile, we should > keep the {{NameNode}} class itself in the {{hadoop-hdfs}} module. 
> This jira tracks the effort of moving the following static helper methods out > of {{NameNode}} and thus {{hadoop-hdfs}} module. A good place to put these > methods is the {{DFSUtilClient}} class: > {code} > public static InetSocketAddress getAddress(String address); > public static InetSocketAddress getAddress(Configuration conf); > public static InetSocketAddress getAddress(URI filesystemURI); > public static URI getUri(InetSocketAddress namenode); > {code} > Be cautious not to bring new checkstyle warnings. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
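The helper signatures listed in the description could look roughly like the sketch below once relocated. This is an assumption-laden illustration, not the committed {{DFSUtilClient}} code: the class name, method bodies, and use of unresolved addresses are all hypothetical; 8020 is the conventional NameNode RPC default port.

```java
import java.net.InetSocketAddress;
import java.net.URI;

// Illustrative sketch only: names and bodies are assumptions, not the
// actual DFSUtilClient implementation.
public class NameNodeAddressSketch {
    static final int DEFAULT_PORT = 8020;  // conventional NameNode RPC port

    // Parse "host" or "host:port" into a socket address.
    public static InetSocketAddress getAddress(String address) {
        int colon = address.indexOf(':');
        if (colon < 0) {
            return InetSocketAddress.createUnresolved(address, DEFAULT_PORT);
        }
        return InetSocketAddress.createUnresolved(
            address.substring(0, colon),
            Integer.parseInt(address.substring(colon + 1)));
    }

    // Extract the authority from a filesystem URI such as hdfs://nn1:9000/.
    public static InetSocketAddress getAddress(URI filesystemURI) {
        int port = filesystemURI.getPort();
        return InetSocketAddress.createUnresolved(
            filesystemURI.getHost(), port == -1 ? DEFAULT_PORT : port);
    }

    // Inverse direction: build an hdfs:// URI from a NameNode address.
    public static URI getUri(InetSocketAddress namenode) {
        return URI.create("hdfs://" + namenode.getHostString() + ":" + namenode.getPort());
    }
}
```

Keeping these as pure string/URI manipulation is what makes them movable to the client module: nothing here touches server-side {{NameNode}} state.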
[jira] [Updated] (HDFS-9022) Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-9022: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) I've committed the patch to trunk and branch-2. Thanks [~liuml07] for the contribution. > Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client > -- > > Key: HDFS-9022 > URL: https://issues.apache.org/jira/browse/HDFS-9022 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 2.8.0 > > Attachments: HDFS-9022.000.patch, HDFS-9022.001.patch, > HDFS-9022.002.patch, HDFS-9022.003.patch, HDFS-9022.004.patch > > > The static helper methods in NameNodes are used in {{hdfs-client}} module. > For example, it's used by the {{DFSClient}} and {{NameNodeProxies}} classes > which are being moved to {{hadoop-hdfs-client}} module. Meanwhile, we should > keep the {{NameNode}} class itself in the {{hadoop-hdfs}} module. > This jira tracks the effort of moving the following static helper methods out > of {{NameNode}} and thus {{hadoop-hdfs}} module. A good place to put these > methods is the {{DFSUtilClient}} class: > {code} > public static InetSocketAddress getAddress(String address); > public static InetSocketAddress getAddress(Configuration conf); > public static InetSocketAddress getAddress(URI filesystemURI); > public static URI getUri(InetSocketAddress namenode); > {code} > Be cautious not to bring new checkstyle warnings. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9022) Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-9022: - Component/s: (was: namenode) > Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client > -- > > Key: HDFS-9022 > URL: https://issues.apache.org/jira/browse/HDFS-9022 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 2.8.0 > > Attachments: HDFS-9022.000.patch, HDFS-9022.001.patch, > HDFS-9022.002.patch, HDFS-9022.003.patch, HDFS-9022.004.patch > > > The static helper methods in NameNodes are used in {{hdfs-client}} module. > For example, it's used by the {{DFSClient}} and {{NameNodeProxies}} classes > which are being moved to {{hadoop-hdfs-client}} module. Meanwhile, we should > keep the {{NameNode}} class itself in the {{hadoop-hdfs}} module. > This jira tracks the effort of moving the following static helper methods out > of {{NameNode}} and thus {{hadoop-hdfs}} module. A good place to put these > methods is the {{DFSUtilClient}} class: > {code} > public static InetSocketAddress getAddress(String address); > public static InetSocketAddress getAddress(Configuration conf); > public static InetSocketAddress getAddress(URI filesystemURI); > public static URI getUri(InetSocketAddress namenode); > {code} > Be cautious not to bring new checkstyle warnings. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9086) Rename dfs.datanode.stripedread.threshold.millis to dfs.datanode.stripedread.timeout.millis
[ https://issues.apache.org/jira/browse/HDFS-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-9086: -- Status: Patch Available (was: Open) > Rename dfs.datanode.stripedread.threshold.millis to > dfs.datanode.stripedread.timeout.millis > --- > > Key: HDFS-9086 > URL: https://issues.apache.org/jira/browse/HDFS-9086 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: HDFS-7285 >Reporter: Andrew Wang >Assignee: Andrew Wang >Priority: Trivial > Attachments: hdfs-9086-hdfs-7285.001.patch > > > This config key is used to control the timeout for ECWorker reads, let's name > it with the standard term "timeout" rather than "threshold". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9097) Erasure coding: update EC command "-s" flag to "-p" when specifying policy
[ https://issues.apache.org/jira/browse/HDFS-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804539#comment-14804539 ] Hadoop QA commented on HDFS-9097: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 18m 53s | Findbugs (version ) appears to be broken on HDFS-7285. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 52s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 43s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 14s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | site | 2m 57s | Site still builds. | | {color:green}+1{color} | checkstyle | 0m 31s | There were no new checkstyle issues. | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 31s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 2m 30s | The patch appears to introduce 4 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 5s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 186m 30s | Tests failed in hadoop-hdfs. 
| | | | 234m 25s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs | | Failed unit tests | hadoop.hdfs.web.TestWebHDFSOAuth2 | | | hadoop.hdfs.TestWriteStripedFileWithFailure | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12757113/HDFS-9097-HDFS-7285.00.patch | | Optional Tests | javadoc javac unit findbugs checkstyle site | | git revision | HDFS-7285 / e36129b | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/12513/artifact/patchprocess/patchReleaseAuditProblems.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12513/artifact/patchprocess/whitespace.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12513/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12513/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12513/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12513/console | This message was automatically generated. > Erasure coding: update EC command "-s" flag to "-p" when specifying policy > -- > > Key: HDFS-9097 > URL: https://issues.apache.org/jira/browse/HDFS-9097 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7285 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Attachments: HDFS-9097-HDFS-7285.00.patch > > > HDFS-8833 missed this update. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8855) Webhdfs client leaks active NameNode connections
[ https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804547#comment-14804547 ] Owen O'Malley commented on HDFS-8855: - A few points: * You need to use the Token.getKind(), Token.getIdentifier(), and Token.getPassword() as the key for the cache. The patch currently uses Token.toString, which uses the identifier, kind, and service. The service is set by the client so it shouldn't be part of the match. The password on the other hand must be part of the match so that guessing the identifier doesn't allow a hacker to impersonate the user. * The timeout should default to 10 minutes instead of 10 seconds. * Please fix the checkstyle and findbugs warnings. * Determine what is wrong with the test case. Other than that, it looks good. > Webhdfs client leaks active NameNode connections > > > Key: HDFS-8855 > URL: https://issues.apache.org/jira/browse/HDFS-8855 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Reporter: Bob Hansen >Assignee: Xiaobing Zhou > Attachments: HDFS-8855.1.patch, HDFS-8855.2.patch, HDFS-8855.3.patch, > HDFS-8855.4.patch, HDFS_8855.prototype.patch > > > The attached script simulates a process opening ~50 files via webhdfs and > performing random reads. Note that there are at most 50 concurrent reads, > and all webhdfs sessions are kept open. Each read is ~64k at a random > position. > The script periodically (once per second) shells into the NameNode and > produces a summary of the socket states. For my test cluster with 5 nodes, > it took ~30 seconds for the NameNode to have ~25000 active connections and > fails. > It appears that each request to the webhdfs client is opening a new > connection to the NameNode and keeping it open after the request is complete. > If the process continues to run, eventually (~30-60 seconds), all of the > open connections are closed and the NameNode recovers. > This smells like SoftReference reaping. 
Are we using SoftReferences in the > webhdfs client to cache NameNode connections but never re-using them? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
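The cache-key point in the review above can be pictured with a minimal sketch. This is a hypothetical illustration only (not the HDFS-8855 patch, and not the real org.apache.hadoop.security.token.Token API): the key combines kind, identifier, and password, deliberately excludes the client-set service, and the expiry defaults to 10 minutes as the review asks.

```java
import java.util.Arrays;
import java.util.Objects;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the cache key described in the review: kind +
// identifier + password, excluding the client-set service field.
public class TokenCacheKey {
    private final String kind;
    private final byte[] identifier;
    private final byte[] password;

    public TokenCacheKey(String kind, byte[] identifier, byte[] password) {
        this.kind = kind;
        this.identifier = identifier.clone();
        this.password = password.clone();
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof TokenCacheKey)) {
            return false;
        }
        TokenCacheKey k = (TokenCacheKey) o;
        // The password must match too, so guessing an identifier is not
        // enough to hit another user's cached connection.
        return kind.equals(k.kind)
            && Arrays.equals(identifier, k.identifier)
            && Arrays.equals(password, k.password);
    }

    @Override
    public int hashCode() {
        return Objects.hash(kind, Arrays.hashCode(identifier),
            Arrays.hashCode(password));
    }

    // Default cache-entry timeout: 10 minutes, per the review comment.
    public static final long TIMEOUT_MS = TimeUnit.MINUTES.toMillis(10);
}
```

Two tokens that differ only in password must not collide, which is the security property the review calls out.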
[jira] [Commented] (HDFS-9022) Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804564#comment-14804564 ] Hudson commented on HDFS-9022: -- FAILURE: Integrated in Hadoop-trunk-Commit #8472 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8472/]) HDFS-9022. Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client. Contributed by Mingliang Liu. (wheat9: rev 9eee97508f350ed4629abb04e7781514ffa04070) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDefaultNameNodePort.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/mount/RpcProgramMountd.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSZKFailoverController.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSShellGenericOptions.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/DfsServlet.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/GetGroups.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileStatus.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ImageServlet.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientFailover.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeFile.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPersistBlocks.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/BootstrapStandby.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/NameNodeProxies.java * hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSUtilClient.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/DFSClientCache.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestWrites.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestGetBlocks.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestFailureToReadEdits.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeRollingUpgrade.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/IPFailoverProxyProvider.java > Move NameNode.getAddress() and NameNode.getUri() to hadoop-hdfs-client > -- > > Key: HDFS-9022 > URL: https://issues.apache.org/jira/browse/HDFS-9022 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Mingliang Liu >Assignee: Mingliang Liu > Fix For: 2.8.0 > > Attachments: HDFS-9022.000.patch, HDFS-9022.001.patch, > HDFS-9022.002.patch, HDFS-9022.003.patch, HDFS-9022.004.patch > > > The static helper methods in NameNodes are used in {{hdfs-client}} module. > For example, it's used by the {{DFSClient}} and {{NameNodeProxies}} classes > which are being moved to {{hadoop-hdfs-client}} module. Meanwhile, we should > keep the {{NameNode}} class itself in the {{hadoop-hdfs}} module. 
> This jira tracks the effort of moving the following static helper methods out > of {{NameNode}} and thus {{hadoop-hdfs}} module. A good place to put these > methods is the {{DFSUtilClient}} class: > {code} > public static InetSocketAddress getAddress(String address); > public static InetSocketAddress getAddress(Configuration conf); > public static InetSocketAddress getAddress(URI filesystemURI); > public static URI getUri(InetSocketAddress namenode); > {code} > Be cautious not to bring new checkstyle warnings. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9086) Rename dfs.datanode.stripedread.threshold.millis to dfs.datanode.stripedread.timeout.millis
[ https://issues.apache.org/jira/browse/HDFS-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804608#comment-14804608 ] Zhe Zhang commented on HDFS-9086: - Thanks Andrew, the patch LGTM. I think Jenkins tried applying it on trunk, maybe because of capitalization in patch name. > Rename dfs.datanode.stripedread.threshold.millis to > dfs.datanode.stripedread.timeout.millis > --- > > Key: HDFS-9086 > URL: https://issues.apache.org/jira/browse/HDFS-9086 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: HDFS-7285 >Reporter: Andrew Wang >Assignee: Andrew Wang >Priority: Trivial > Attachments: hdfs-9086-hdfs-7285.001.patch > > > This config key is used to control the timeout for ECWorker reads, let's name > it with the standard term "timeout" rather than "threshold". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9004) Add upgrade domain to DatanodeInfo
[ https://issues.apache.org/jira/browse/HDFS-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-9004: -- Attachment: HDFS-9004-3.patch Thanks [~eddyxu]! Here is the updated patch that addresses your comment. The failed unit tests aren't related. > Add upgrade domain to DatanodeInfo > -- > > Key: HDFS-9004 > URL: https://issues.apache.org/jira/browse/HDFS-9004 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Attachments: HDFS-9004-2.patch, HDFS-9004-3.patch, HDFS-9004.patch > > > As part of upgrade domain feature, we first need to add upgrade domain string > to {{DatanodeInfo}}. It includes things like: > * Add a new field to DatanodeInfo. > * Modify protobuf for DatanodeInfo. > * Update DatanodeInfo.getDatanodeReport to include upgrade domain. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
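The three steps listed in the issue can be pictured with a tiny sketch (hypothetical names, not the actual DatanodeInfo or its protobuf): add an optional upgrade-domain field, and have the datanode report include it when set.

```java
// Hypothetical sketch of the HDFS-9004 steps: a new optional field on the
// datanode descriptor, surfaced in getDatanodeReport when present. The
// real change also updates the DatanodeInfo protobuf wire format.
public class DatanodeInfoSketch {
    private String upgradeDomain; // new optional field

    public String getUpgradeDomain() {
        return upgradeDomain;
    }

    public void setUpgradeDomain(String upgradeDomain) {
        this.upgradeDomain = upgradeDomain;
    }

    /** The report appends the upgrade domain only when it is set. */
    public String getDatanodeReport() {
        StringBuilder sb = new StringBuilder("Name: dn-1\n");
        if (upgradeDomain != null) {
            sb.append("Upgrade domain: ").append(upgradeDomain).append('\n');
        }
        return sb.toString();
    }
}
```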
[jira] [Updated] (HDFS-5802) NameNode does not check for inode type before traversing down a path
[ https://issues.apache.org/jira/browse/HDFS-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-5802: Status: Patch Available (was: Open) > NameNode does not check for inode type before traversing down a path > > > Key: HDFS-5802 > URL: https://issues.apache.org/jira/browse/HDFS-5802 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.0.0-alpha >Reporter: Harsh J >Assignee: Xiao Chen >Priority: Trivial > Attachments: HDFS-5802.001.patch, HDFS-5802.002.patch, > HDFS-5802.003.patch, HDFS-5802.004.patch > > > This came up during the discussion on a forum at > http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Permission-denied-access-EXECUTE-on-getting-the-status-of-a-file/m-p/5049#M162 > surrounding an fs.exists(…) check running on a path /foo/bar, where /foo is > a file and not a directory. > In such a case, NameNode yields a user-confusing message of {{Permission > denied: user=foo, access=EXECUTE, inode="/foo":foo:foo:-rw-r--r--}} instead > of clearly saying (and realising) "/foo is not a directory" or "/foo is a > file" before it tries to traverse further down to locate the requested path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
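The check the issue asks for can be sketched as follows (hypothetical names, not the NameNode permission checker): before walking further down /foo/bar, verify that each ancestor component is a directory, and fail with a clear "is a file" message instead of the confusing EXECUTE permission error.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the ancestor-type check HDFS-5802 asks for.
public class PathCheckSketch {
    enum Type { FILE, DIRECTORY }

    // Stand-in for the inode tree: path -> inode type.
    static final Map<String, Type> inodes = new HashMap<>();

    /** Throws a clear error if any ancestor of the path is a file. */
    static void checkTraverse(String path) {
        String[] parts = path.split("/");
        StringBuilder cur = new StringBuilder();
        for (int i = 1; i < parts.length - 1; i++) { // ancestors only
            cur.append('/').append(parts[i]);
            if (inodes.get(cur.toString()) == Type.FILE) {
                throw new IllegalStateException("Path component " + cur
                    + " is a file, not a directory");
            }
        }
    }
}
```

With /foo registered as a file, checkTraverse("/foo/bar") fails fast with the explicit message rather than a permission error on a bogus EXECUTE check.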
[jira] [Commented] (HDFS-9092) Nfs silently drops overlapping write requests, thus data copying can't complete
[ https://issues.apache.org/jira/browse/HDFS-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804667#comment-14804667 ] Brandon Li commented on HDFS-9092: -- Thank you, [~yzhangal], for the patch. Could you roughly describe the idea of the fix? > Nfs silently drops overlapping write requests, thus data copying can't > complete > --- > > Key: HDFS-9092 > URL: https://issues.apache.org/jira/browse/HDFS-9092 > Project: Hadoop HDFS > Issue Type: Bug > Components: nfs >Affects Versions: 2.7.1 >Reporter: Yongjun Zhang >Assignee: Yongjun Zhang > Attachments: HDFS-9092.001.patch > > > When NOT using the 'sync' option, the NFS writes may issue the following warning: > org.apache.hadoop.hdfs.nfs.nfs3.OpenFileCtx: Got an overlapping write > (1248751616, 1249677312), nextOffset=1248752400. Silently drop it now > and the size of data copied via NFS will stay at 1248752400. > What happens is: > 1. The write requests from the client are sent asynchronously. > 2. The NFS gateway has a handler that handles the incoming requests by creating an > internal write request structure and putting it into a cache; > 3. In parallel, a separate thread in the NFS gateway takes requests out of the > cache and writes the data to HDFS. > The current offset is how much data has been written by the write thread in > step 3. The detection of an overlapping write request happens in step 2, but it only > checks the write request against the current offset, and trims the request if > necessary. Because the write requests are sent asynchronously, if two > requests are beyond the current offset and they overlap, this is not detected > and both are put into the cache. This causes the symptom reported in this case > at step 3. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
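The missing check described in the issue can be sketched like this (illustrative names, not the OpenFileCtx internals): two queued writes can both start beyond the current flush offset yet overlap each other, so the overlap test must compare a new request against the already-queued requests, not just against the current offset.

```java
import java.util.TreeMap;

// Illustrative sketch of overlap detection among queued writes;
// names are hypothetical, not the NFS gateway's actual code.
public class PendingWriteSketch {
    // offset -> length of queued writes not yet flushed to HDFS
    private final TreeMap<Long, Long> pending = new TreeMap<>();

    /** Queues [offset, offset+len); returns false if it overlaps a queued write. */
    public boolean offer(long offset, long len) {
        Long floor = pending.floorKey(offset);
        if (floor != null && floor + pending.get(floor) > offset) {
            return false; // overlaps the preceding queued request
        }
        Long ceil = pending.ceilingKey(offset);
        if (ceil != null && ceil < offset + len) {
            return false; // overlaps the following queued request
        }
        pending.put(offset, len);
        return true;
    }
}
```

In the reported scenario, the second of two overlapping requests beyond nextOffset would be rejected (or trimmed) at enqueue time instead of being silently dropped later.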
[jira] [Commented] (HDFS-9040) Erasure coding: Refactor DFSStripedOutputStream (Move Namenode RPC Requests to Coordinator)
[ https://issues.apache.org/jira/browse/HDFS-9040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803339#comment-14803339 ] Jing Zhao commented on HDFS-9040: - I think the key point here is to bump the GS, which is necessary to identify stale/corrupted internal blocks. For example, when writing the last stripe, suppose the last data block fails. Based on internal block lengths alone we cannot identify the failure. Later, when we support hflush/hsync, we will have to use (GS+block group size) to identify the correct parity blocks. But maybe the NN does not need a strong correctness guarantee for the expected replica list. Block location information can eventually be corrected based on full/incremental block reports. > Erasure coding: Refactor DFSStripedOutputStream (Move Namenode RPC Requests > to Coordinator) > --- > > Key: HDFS-9040 > URL: https://issues.apache.org/jira/browse/HDFS-9040 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Walter Su > Attachments: HDFS-9040-HDFS-7285.002.patch, > HDFS-9040-HDFS-7285.003.patch, HDFS-9040.00.patch, HDFS-9040.001.wip.patch, > HDFS-9040.02.bgstreamer.patch > > > The general idea is to simplify error handling logic. > Proposal 1: > A BlockGroupDataStreamer to communicate with the NN to allocate/update blocks, > while StripedDataStreamers only have to stream blocks to DNs. > Proposal 2: > See below the > [comment|https://issues.apache.org/jira/browse/HDFS-9040?focusedCommentId=14741388=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14741388] > from [~jingzhao]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
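The length-vs-GS point in the comment above can be pictured with a tiny, purely illustrative sketch (hypothetical names, not the HDFS-9040 code): after a generation-stamp bump, an internal block that missed the last write keeps the old GS, so it can be flagged as stale even when its length matches a healthy block's.

```java
import java.util.ArrayList;
import java.util.List;

// Purely illustrative: flag internal blocks whose generation stamp is
// below the expected (bumped) value. Length alone cannot distinguish them.
public class StaleBlockSketch {
    /** Returns the indices of internal blocks with a stale generation stamp. */
    public static List<Integer> staleIndices(long[] gs, long expectedGs) {
        List<Integer> stale = new ArrayList<>();
        for (int i = 0; i < gs.length; i++) {
            if (gs[i] < expectedGs) {
                stale.add(i);
            }
        }
        return stale;
    }
}
```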