[jira] [Updated] (HDFS-7677) DistributedFileSystem#truncate should resolve symlinks
[ https://issues.apache.org/jira/browse/HDFS-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-7677: - Fix Version/s: (was: 2.7.0) Status: Patch Available (was: Open) DistributedFileSystem#truncate should resolve symlinks -- Key: HDFS-7677 URL: https://issues.apache.org/jira/browse/HDFS-7677 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-7677.001.patch We should resolve the symlinks in DistributedFileSystem#truncate as we do for {{create}}, {{open}}, {{append}} and so on; I don't see any reason not to support it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
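[Editor's note] The resolution the issue asks for is the same retry-through-links behavior that create/open/append already have. The sketch below is a standalone toy, not HDFS code: a hypothetical in-memory symlink table stands in for NameNode metadata, and the loop mirrors the shape of the client-side link-resolution retry.

```java
import java.util.HashMap;
import java.util.Map;

public class TruncateResolveSketch {
    // Hypothetical symlink table standing in for NameNode metadata.
    public static final Map<String, String> symlinks = new HashMap<>();

    // Follow links until a concrete path is reached, with a hop limit to
    // guard against cycles -- the same shape as the retry loop that
    // create/open/append go through before acting on a path.
    public static String resolve(String path) {
        int hops = 0;
        while (symlinks.containsKey(path)) {
            if (++hops > 32) {
                throw new IllegalStateException("possible symlink cycle: " + path);
            }
            path = symlinks.get(path);
        }
        return path;
    }

    public static void main(String[] args) {
        symlinks.put("/user/link", "/user/real");
        System.out.println(resolve("/user/link")); // prints /user/real
    }
}
```

truncate would then operate on the fully resolved path, exactly as the other operations do.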
[jira] [Updated] (HDFS-7224) Allow reuse of NN connections via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-7224: - Resolution: Fixed Fix Version/s: 2.7.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Allow reuse of NN connections via webhdfs - Key: HDFS-7224 URL: https://issues.apache.org/jira/browse/HDFS-7224 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.5.0 Reporter: Eric Payne Assignee: Eric Payne Fix For: 2.7.0 Attachments: HDFS-7224.v1.201410301923.txt, HDFS-7224.v2.201410312033.txt, HDFS-7224.v3.txt, HDFS-7224.v4.txt In very large clusters, the webhdfs client could get bind exceptions because it runs out of ephemeral ports. This could happen when using webhdfs to talk to the NN in order to do list globbing of a huge amount of files. WebHdfsFileSystem#jsonParse gets the input/error stream from the connection, but never closes the stream. Since it's not closed, the JVM thinks the stream may still be transferring data, so the next time through this code, it has to get a new connection rather than reusing an existing one. The lack of connection reuse causes poor latency and adds too much overhead to the NN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
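[Editor's note] The fix pattern the description points at looks roughly like the following; this is a standalone sketch of the general HttpURLConnection idiom, not the actual patch. Fully consuming and then closing the response InputStream is what lets the JVM return the underlying socket to its keep-alive pool instead of opening a fresh connection (and burning another ephemeral port) on the next request.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class DrainAndClose {
    // Read the whole response body, then close the stream. Leaving the
    // stream open (as jsonParse did) makes the JVM assume data may still be
    // in flight, so the connection cannot be reused.
    public static byte[] readFully(InputStream in) throws IOException {
        try (InputStream s = in;
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = s.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } // try-with-resources closes the stream even on error paths
    }
}
```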
[jira] [Commented] (HDFS-7224) Allow reuse of NN connections via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291862#comment-14291862 ] Kihwal Lee commented on HDFS-7224: -- I've committed this to branch-2 and trunk. Thanks for working on this, Eric, and for the review, Daryn.
[jira] [Commented] (HDFS-7224) Allow reuse of NN connections via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291875#comment-14291875 ] Hudson commented on HDFS-7224: -- SUCCESS: Integrated in Hadoop-trunk-Commit #6930 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6930/]) HDFS-7224. Allow reuse of NN connections via webhdfs. Contributed by Eric Payne (kihwal: rev 2b0fa20f69417326a92beac10ffa072db2616e73) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestFSMainOperationsWebHdfs.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
[jira] [Commented] (HDFS-7577) Add additional headers that are needed by Windows
[ https://issues.apache.org/jira/browse/HDFS-7577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291908#comment-14291908 ] Thanh Do commented on HDFS-7577: Hi [~cmccabe]. Could you please take a look at the new patch? I'd really like to get this in so that I can start the next patch, which depends on this one. Thank you. Add additional headers that are needed by Windows Key: HDFS-7577 URL: https://issues.apache.org/jira/browse/HDFS-7577 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Thanh Do Assignee: Thanh Do Attachments: HDFS-7577-branch-HDFS-6994-0.patch, HDFS-7577-branch-HDFS-6994-1.patch This jira involves adding a list of (mostly dummy) headers that are available in POSIX systems, but not in Windows. One step towards making libhdfs3 build on Windows. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7677) DistributedFileSystem#truncate should resolve symlinks
[ https://issues.apache.org/jira/browse/HDFS-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-7677: - Attachment: HDFS-7677.001.patch
[jira] [Commented] (HDFS-7677) DistributedFileSystem#truncate should resolve symlinks
[ https://issues.apache.org/jira/browse/HDFS-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292001#comment-14292001 ] Hadoop QA commented on HDFS-7677: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694546/HDFS-7677.001.patch against trunk revision 7b82c4a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot org.apache.hadoop.hdfs.TestDecommission org.apache.hadoop.hdfs.server.datanode.TestBlockScanner Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9328//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9328//console This message is automatically generated.
[jira] [Commented] (HDFS-7676) Fix TestFileTruncate to avoid bug of HDFS-7611
[ https://issues.apache.org/jira/browse/HDFS-7676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292314#comment-14292314 ] Colin Patrick McCabe commented on HDFS-7676: [~shv], are this JIRA and HDFS-7654 duplicates? Should we close HDFS-7654? Fix TestFileTruncate to avoid bug of HDFS-7611 -- Key: HDFS-7676 URL: https://issues.apache.org/jira/browse/HDFS-7676 Project: Hadoop HDFS Issue Type: Sub-task Components: test Affects Versions: 3.0.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Fix For: 2.7.0 Attachments: HDFS-7676.patch This is to fix testTruncateEditLogLoad(), which is failing due to the bug described in HDFS-7611. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7312) Update DistCp v1 to optionally not use tmp location
[ https://issues.apache.org/jira/browse/HDFS-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292324#comment-14292324 ] Hadoop QA commented on HDFS-7312: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12683251/HDFS-7312.007.patch against trunk revision 21d5599. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9329//console This message is automatically generated. Update DistCp v1 to optionally not use tmp location --- Key: HDFS-7312 URL: https://issues.apache.org/jira/browse/HDFS-7312 Project: Hadoop HDFS Issue Type: Improvement Components: tools Affects Versions: 2.5.1 Reporter: Joseph Prosser Assignee: Joseph Prosser Priority: Minor Attachments: HDFS-7312.001.patch, HDFS-7312.002.patch, HDFS-7312.003.patch, HDFS-7312.004.patch, HDFS-7312.005.patch, HDFS-7312.006.patch, HDFS-7312.007.patch, HDFS-7312.patch Original Estimate: 72h Remaining Estimate: 72h DistCp v1 currently copies files to a tmp location and then renames that to the specified destination. This can cause performance issues on filesystems such as S3. A -skiptmp flag will be added to bypass this step and copy directly to the destination. This feature mirrors a similar one added to HBase ExportSnapshot [HBASE-9|https://issues.apache.org/jira/browse/HBASE-9] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7665) Add definition of truncate preconditions/postconditions to filesystem specification
[ https://issues.apache.org/jira/browse/HDFS-7665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292280#comment-14292280 ] Konstantin Shvachko commented on HDFS-7665: --- Makes sense. Do you want to implement this? It seems like you have experience with this. Add definition of truncate preconditions/postconditions to filesystem specification --- Key: HDFS-7665 URL: https://issues.apache.org/jira/browse/HDFS-7665 Project: Hadoop HDFS Issue Type: Sub-task Components: documentation Affects Versions: 3.0.0 Reporter: Steve Loughran Fix For: 3.0.0 With the addition of a major new feature to filesystems, the filesystem specification in hadoop-common/site is now out of sync. This means that
# there's no strict specification of what it should do
# you can't derive tests from that specification
# other people trying to implement the API will have to infer what to do from the HDFS source
# there's no way to decide whether or not the HDFS implementation does what is intended
# without matching tests against the raw local FS, differences between the HDFS impl and the Posix standard one won't be caught until it is potentially too late to fix.
The operation should be relatively easy to define (after a truncate, the file's bytes [0...len-1] must equal the original bytes, length(file)==len, etc.) The truncate tests already written could then be pulled up into contract tests which any filesystem implementation can run against. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
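[Editor's note] The pre/postconditions sketched in the description can be written down as a small executable check. This is an illustration of the contract only, not the specification notation used in hadoop-common/site; the class and method names are invented for the example.

```java
import java.util.Arrays;

public class TruncateContract {
    // Model of the operation: truncate(data, len) keeps bytes [0 .. len-1].
    // Precondition: 0 <= len <= data.length.
    public static byte[] truncate(byte[] data, int len) {
        if (len < 0 || len > data.length) {
            throw new IllegalArgumentException("invalid new length: " + len);
        }
        return Arrays.copyOf(data, len);
    }

    // Postcondition from the description: length(file) == len, and the
    // surviving bytes equal the original bytes [0 .. len-1].
    public static boolean postconditionHolds(byte[] original, byte[] truncated, int len) {
        return truncated.length == len
            && Arrays.equals(truncated, Arrays.copyOf(original, len));
    }
}
```

A contract test derived from this would run the same assertions against each FileSystem implementation.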
[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292301#comment-14292301 ] Andrew Wang commented on HDFS-7411: --- This is the same testIncludeByRegistrationName failure as in HDFS-7527. I think my test improvements make it fail reliably instead of flakily (which is correct, since it's broken), but maybe this should be @Ignore'd in this patch. Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, hdfs-7411.006.patch, hdfs-7411.007.patch, hdfs-7411.008.patch Would be nice to split out decommission logic from DatanodeManager to DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7675) Unused member DFSClient.spanReceiverHost
[ https://issues.apache.org/jira/browse/HDFS-7675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292308#comment-14292308 ] Colin Patrick McCabe commented on HDFS-7675: It's not redundant. Calling {{SpanReceiverHost#getInstance}} initializes {{SpanReceiverHost#SingletonHolder#INSTANCE}}, which is needed for tracing to operate. This code was taken from HBase, which has a similar construct for tracing. I agree that we could probably skip storing the {{SpanReceiverHost}} inside {{DFSClient}}, as long as there was a comment about why we were calling {{getInstance}}. Unused member DFSClient.spanReceiverHost Key: HDFS-7675 URL: https://issues.apache.org/jira/browse/HDFS-7675 Project: Hadoop HDFS Issue Type: Bug Components: dfsclient Affects Versions: 3.0.0 Reporter: Konstantin Shvachko DFSClient.spanReceiverHost is initialised but never used. Could be redundant. This was introduced by HDFS-7055. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7339) Allocating and persisting block groups in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292295#comment-14292295 ] Tsz Wo Nicholas Sze commented on HDFS-7339: --- Are you proposing we enforce 4 zero bits for all blocks, striped or regular? No, only for the first block in a block group. We have a few choices:
# Allocate IDs for normal blocks as usual. Allocate *consecutive IDs for blocks in a group*.
#* e.g. allocateBlock - 0x301, allocateBlock - 0x302, allocateBlockGroup - 0x303..0x30B, allocateBlockGroup - 0x30C..0x314, allocateBlock - 0x315, ...
#* Since the block IDs in a block group could cross the low 4-bit boundary, the BlocksMap lookup needs to be executed twice. e.g. for 0x312, first try lookup(0x312), which returns null, and then try lookup(0x30F), which returns the block group with 0x30C as the first block.
#* *All lookups need to be executed twice!* It does not seem a good solution (which is why I did not mention it previously).
# Allocate IDs for normal blocks as usual. Allocate consecutive IDs for blocks in a group and *skip to the next zero if it crosses the low 4-bit boundary*.
#* e.g. allocateBlock - 0x301, allocateBlock - 0x302, allocateBlockGroup - 0x303..0x30B, allocateBlockGroup - 0x310..0x318, allocateBlock - 0x319, ...
#* Only one lookup is needed.
# Allocate IDs for normal blocks as usual. Allocate consecutive IDs for blocks in a group and *always skip to the next zero low 4-bit*.
#* e.g. allocateBlock - 0x301, allocateBlock - 0x302, allocateBlockGroup - 0x310..0x318, allocateBlockGroup - 0x320..0x328, allocateBlock - 0x329, ...
#* The low 4-bit of the first block ID in a block group is always zero.
# Allocate IDs for normal blocks and *skip if the low 4-bit is zero*. Allocate consecutive IDs for blocks in a group and always skip to the next zero low 4-bit.
#* e.g. allocateBlock - 0x301, allocateBlock - 0x302, allocateBlockGroup - 0x310..0x318, allocateBlockGroup - 0x320..0x328, allocateBlock - 0x329, allocateBlock - 0x32A, allocateBlock - 0x32B, allocateBlock - 0x32C, allocateBlock - 0x32D, allocateBlock - 0x32E, allocateBlock - 0x32F, *allocateBlock - 0x331*, ...
#* If the low 4-bit of an ID is zero, it must be the first block in a block group.
# Allocate IDs for normal blocks, skip if the low 4-bit is zero, and *skip to the next low 4-bit if the previous allocation was for a block group*. Allocate consecutive IDs for blocks in a group and always skip to the next zero low 4-bit.
#* e.g. allocateBlock - 0x301, allocateBlock - 0x302, allocateBlockGroup - 0x310..0x318, allocateBlockGroup - 0x320..0x328, *allocateBlock - 0x331*, ...
#* Normal blocks and blocks in a block group cannot share the same high 60-bit prefix.
# The same as before except that we *do not skip if the low 4-bit is zero*.
#* e.g. allocateBlock - 0x301, allocateBlock - 0x302, allocateBlockGroup - 0x310..0x318, allocateBlockGroup - 0x320..0x328, *allocateBlock - 0x330*, ...
#* A normal block ID could have a zero low 4-bit.
Which one is the best? Allocating and persisting block groups in NameNode -- Key: HDFS-7339 URL: https://issues.apache.org/jira/browse/HDFS-7339 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-7339-001.patch, HDFS-7339-002.patch, HDFS-7339-003.patch, HDFS-7339-004.patch, HDFS-7339-005.patch, HDFS-7339-006.patch, Meta-striping.jpg, NN-stripping.jpg All erasure codec operations center around the concept of _block group_; they are formed in initial encoding and looked up in recoveries and conversions. A lightweight class {{BlockGroup}} is created to record the original and parity blocks in a coding group, as well as a pointer to the codec schema (pluggable codec schemas will be supported in HDFS-7337). With the striping layout, the HDFS client needs to operate on all blocks in a {{BlockGroup}} concurrently.
Therefore we propose to extend a file’s inode to switch between _contiguous_ and _striping_ modes, with the current mode recorded in a binary flag. An array of BlockGroups (or BlockGroup IDs) is added, which remains empty for “traditional” HDFS files with contiguous block layout. The NameNode creates and maintains {{BlockGroup}} instances through the new {{ECManager}} component; the attached figure has an illustration of the architecture. As a simple example, when a {_Striping+EC_} file is created and written to, it will serve requests from the client to allocate new {{BlockGroups}} and store them under the {{INodeFile}}. In the current phase, {{BlockGroups}} are allocated both in initial online encoding and in the conversion from replication to EC. {{ECManager}} also facilitates the lookup of {{BlockGroup}} information for block recovery work. -- This message was sent by Atlassian JIRA
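[Editor's note] The "always skip to the next zero low 4-bit" variants discussed in the comment above can be sketched as a small allocator; the class name, starting ID, and group size below are invented for the example (0x301 and 9 blocks per group just line up with the comment's figures).

```java
public class BlockIdAllocator {
    // Starts at 0x301 purely to match the examples in the discussion.
    private long next = 0x301;
    public static final int GROUP_SIZE = 9; // e.g. 6 data + 3 parity blocks

    public long allocateBlock() {
        return next++;
    }

    // A group always starts at an ID whose low 4 bits are zero, occupying
    // GROUP_SIZE consecutive IDs from there.
    public long allocateBlockGroup() {
        if ((next & 0xF) != 0) {
            next = (next | 0xF) + 1; // skip to the next zero low 4-bit
        }
        long first = next;
        next += GROUP_SIZE;
        return first;
    }

    // With this scheme a single mask recovers the group's first block,
    // so the BlocksMap needs only one lookup per ID.
    public static long groupFirstBlock(long id) {
        return id & ~0xFL;
    }
}
```

Running the comment's sequence through this allocator reproduces its example: blocks 0x301 and 0x302, then groups starting at 0x310 and 0x320.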
[jira] [Commented] (HDFS-7677) DistributedFileSystem#truncate should resolve symlinks
[ https://issues.apache.org/jira/browse/HDFS-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292298#comment-14292298 ] Konstantin Shvachko commented on HDFS-7677: --- I don't see it in my design, which is this one: https://issues.apache.org/jira/secure/attachment/12679075/HDFS_truncate.pdf I don't see any reason not to support it either.
[jira] [Updated] (HDFS-7312) Update DistCp v1 to optionally not use tmp location
[ https://issues.apache.org/jira/browse/HDFS-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Prosser updated HDFS-7312: - Status: Patch Available (was: Open)
[jira] [Created] (HDFS-7680) Support dataset-specific choice of short circuit implementation
Joe Pallas created HDFS-7680: Summary: Support dataset-specific choice of short circuit implementation Key: HDFS-7680 URL: https://issues.apache.org/jira/browse/HDFS-7680 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, dfsclient, hdfs-client Affects Versions: 3.0.0 Reporter: Joe Pallas Assignee: Joe Pallas As described in HDFS-5194, the current support for short circuit reading is tightly coupled to the default Dataset implementation. Since alternative implementations of the FsDatasetSpi may use a different short circuit pathway, there needs to be a way for the client to acquire the right kind of BlockReader. Reviewing some considerations: Today, there is only one dataset per datanode (with multiple volumes). Is that likely to change? Can there be multiple datanodes local to a client? Is it okay to assume that the client and datanode share configuration? More broadly, how should the client discover the appropriate short-circuit implementation? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7666) Datanode blockId layout upgrade threads should be daemon thread
[ https://issues.apache.org/jira/browse/HDFS-7666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292273#comment-14292273 ] Colin Patrick McCabe commented on HDFS-7666: Hi Rakesh, I'm not sure why we would want the blockID upgrade threads to be daemon threads. Daemon threads don't block the JVM from exiting if they are the only remaining threads. But we don't expect the JVM to exit while an upgrade is still incomplete. In fact, if it does, we are in big trouble. Datanode blockId layout upgrade threads should be daemon thread --- Key: HDFS-7666 URL: https://issues.apache.org/jira/browse/HDFS-7666 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-7666-v1.patch This jira is to mark the layout upgrade thread as a daemon thread.
{code}
int numLinkWorkers = datanode.getConf().getInt(
    DFSConfigKeys.DFS_DATANODE_BLOCK_ID_LAYOUT_UPGRADE_THREADS_KEY,
    DFSConfigKeys.DFS_DATANODE_BLOCK_ID_LAYOUT_UPGRADE_THREADS);
ExecutorService linkWorkers = Executors.newFixedThreadPool(numLinkWorkers);
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
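[Editor's note] For reference, making the pool's workers daemon threads is a matter of handing `Executors.newFixedThreadPool` a `ThreadFactory` that sets the daemon flag. The sketch below shows the mechanism only (the class and thread names are illustrative, not the attached patch), independent of whether daemon threads are actually desirable here.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

public class DaemonPool {
    // Pass a ThreadFactory that marks each worker as a daemon, so the pool's
    // threads alone never keep the JVM alive -- which is exactly the
    // behavior the comment above argues against for upgrade work.
    public static ExecutorService newDaemonFixedThreadPool(int numWorkers) {
        ThreadFactory daemonFactory = r -> {
            Thread t = new Thread(r, "block-id-layout-upgrade");
            t.setDaemon(true);
            return t;
        };
        return Executors.newFixedThreadPool(numWorkers, daemonFactory);
    }
}
```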
[jira] [Commented] (HDFS-7677) DistributedFileSystem#truncate should resolve symlinks
[ https://issues.apache.org/jira/browse/HDFS-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292311#comment-14292311 ] Colin Patrick McCabe commented on HDFS-7677: I think we should support this, just for consistency with the other operations. But keep in mind that symlinks are hard-disabled in branch-2 because of security and other issues, and that's probably not going to change any time soon. See HADOOP-10019 for details.
[jira] [Created] (HDFS-7681) Fix ReplicaInputStream constructor to take InputStreams
Joe Pallas created HDFS-7681: Summary: Fix ReplicaInputStream constructor to take InputStreams Key: HDFS-7681 URL: https://issues.apache.org/jira/browse/HDFS-7681 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 3.0.0 Reporter: Joe Pallas Assignee: Joe Pallas As noted in HDFS-5194, the constructor for {{ReplicaInputStream}} takes {{FileDescriptor}} s that are immediately turned into {{InputStream}} s, while the callers already have {{FileInputStream}} s from which they extract {{FileDescriptor}} s. This seems to have been done as part of a large set of changes to appease findbugs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7630) TestConnCache hardcodes block size without considering native OS
[ https://issues.apache.org/jira/browse/HDFS-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292353#comment-14292353 ] Arpit Agarwal commented on HDFS-7630: - Hi [~sam liu], it is hard to review and commit many trivial patches. It would really help if HDFS-7624 through HDFS-7630 could be combined into a single Jira with a consolidated patch. I think the change looks fine; I'd like to verify it on Windows before I +1. TestConnCache hardcodes block size without considering native OS --- Key: HDFS-7630 URL: https://issues.apache.org/jira/browse/HDFS-7630 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: sam liu Assignee: sam liu Attachments: HDFS-7630.001.patch TestConnCache hardcodes the block size as 'BLOCK_SIZE = 4096'; however, it's incorrect on some platforms. For example, on the power platform, the correct value is 65536. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7339) Allocating and persisting block groups in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292174#comment-14292174 ] Zhe Zhang commented on HDFS-7339: - bq. If we enforce block id allocation so that the lower 4-bit of the first ID must be zeros, then it is very similar to the scheme proposed in the design doc except there is no notion of block group in the block IDs. [~szetszwo] Are you proposing we enforce 4 zero bits for _all_ blocks, striped or regular? It seems to shrink the usable ID space by quite a lot. I think we can still have 2 separate ID generators, both generating regular block IDs. This way we are also getting rid of block group IDs. Thanks again for the in-depth review and advice.
[jira] [Commented] (HDFS-7659) We should check that the new length for truncate is not a negative value.
[ https://issues.apache.org/jira/browse/HDFS-7659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292210#comment-14292210 ] Konstantin Shvachko commented on HDFS-7659: --- No problem. I was just trying to find the most effective way to get it done. We should check that the new length for truncate is not a negative value. - Key: HDFS-7659 URL: https://issues.apache.org/jira/browse/HDFS-7659 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.7.0 Attachments: HDFS-7659-branch2.patch, HDFS-7659.001.patch, HDFS-7659.002.patch, HDFS-7659.003.patch It's obvious that we should check that the new length for truncate is not a negative value. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7682) {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content
[ https://issues.apache.org/jira/browse/HDFS-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7682: --- Status: Patch Available (was: Open) {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content Key: HDFS-7682 URL: https://issues.apache.org/jira/browse/HDFS-7682 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-7682.000.patch DistributedFileSystem#getFileChecksum of a snapshotted file includes non-snapshotted content. This happens because DistributedFileSystem#getFileChecksum simply calculates the checksum of all of the CRCs from the blocks in the file. But in the case of a snapshotted file, we don't want to include data in the checksum that was appended to the last block in the file after the snapshot was taken. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
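[Editor's note] A toy model of the bug and its fix, not the actual getFileChecksum code (which combines per-block CRCs with MD5 over DataNode-supplied checksums): if the file-level checksum is computed only over the bytes that existed at snapshot time, with the last block clamped to the snapshot length, appends after the snapshot no longer change the snapshot's checksum. The block size and combining scheme below are invented for illustration.

```java
import java.util.zip.CRC32;

public class SnapshotChecksumSketch {
    public static final int BLOCK_SIZE = 4; // toy block size

    // File-level checksum built from per-block CRCs, computed only over the
    // first 'length' bytes; the last block is clamped to 'length'. If the
    // last block's CRC instead covered the block's full current contents,
    // bytes appended after a snapshot would leak into the snapshot's
    // checksum -- the bug described above.
    public static long fileChecksum(byte[] data, int length) {
        CRC32 combined = new CRC32();
        for (int off = 0; off < length; off += BLOCK_SIZE) {
            int end = Math.min(off + BLOCK_SIZE, length); // clamp last block
            CRC32 blockCrc = new CRC32();
            blockCrc.update(data, off, end - off);
            long v = blockCrc.getValue();
            for (int i = 0; i < 8; i++) {
                // fold the block CRC byte by byte into the file checksum
                combined.update((int) (v >>> (8 * i)) & 0xFF);
            }
        }
        return combined.getValue();
    }
}
```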
[jira] [Commented] (HDFS-7577) Add additional headers that are needed by Windows
[ https://issues.apache.org/jira/browse/HDFS-7577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292414#comment-14292414 ] Colin Patrick McCabe commented on HDFS-7577: cpuid.h: we need an #error when this isn't x86, to explain why compilation is failing (due to the unimplemented function) +1 once that's resolved Add additional headers that are needed by Windows Key: HDFS-7577 URL: https://issues.apache.org/jira/browse/HDFS-7577 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Thanh Do Assignee: Thanh Do Attachments: HDFS-7577-branch-HDFS-6994-0.patch, HDFS-7577-branch-HDFS-6994-1.patch This jira involves adding a list of (mostly dummy) headers that are available in POSIX systems, but not in Windows. One step towards making libhdfs3 buildable on Windows. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7648) Verify the datanode directory layout
[ https://issues.apache.org/jira/browse/HDFS-7648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292443#comment-14292443 ] Colin Patrick McCabe commented on HDFS-7648: I agree with Nicholas that we should implement this in the DirectoryScanner. Probably what we want to do is log a warning about files that are in locations they do not belong in. I do not think we should implement this in the layout version upgrade code, since that only gets run once. The fact that hardlinking is done in parallel in the upgrade code should not be relevant for implementing this. Verify the datanode directory layout Key: HDFS-7648 URL: https://issues.apache.org/jira/browse/HDFS-7648 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Tsz Wo Nicholas Sze HDFS-6482 changed the datanode layout to use the block ID to determine the directory in which to store the block. We should have some mechanism to verify it. Either DirectoryScanner or block report generation could do the check. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
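Under the HDFS-6482 layout, a block's target directory is a pure function of its ID, so a scanner can recompute the expected path and warn on a mismatch. The sketch below assumes the two-level layout derived from two 5-bit slices of the block ID (a 32x32 subdirectory tree); the exact bit positions here are an assumption for illustration, not a quotation of the HDFS source:

```java
// Sketch of the check proposed above: recompute a block's expected
// directory from its ID and compare it with where the file was found.
// The two 5-bit slices mirror the assumed HDFS-6482 32x32 layout.
public class BlockDirCheckSketch {
    static String expectedDir(long blockId) {
        int d1 = (int) ((blockId >> 16) & 0x1F);  // first-level subdir
        int d2 = (int) ((blockId >> 8) & 0x1F);   // second-level subdir
        return "subdir" + d1 + "/subdir" + d2;
    }

    /** Returns true when the block file sits in the directory its ID implies. */
    static boolean inExpectedDir(long blockId, String actualDir) {
        return expectedDir(blockId).equals(actualDir);
    }

    public static void main(String[] args) {
        long id = 0x123456L;  // bits 16-20 -> 18, bits 8-12 -> 20
        if (!"subdir18/subdir20".equals(expectedDir(id))) throw new AssertionError();
        if (inExpectedDir(id, "subdir0/subdir0")) throw new AssertionError();
        if (!inExpectedDir(id, "subdir18/subdir20")) throw new AssertionError();
    }
}
```

A DirectoryScanner-style check would call something like {{inExpectedDir}} per discovered block file and log a warning on {{false}}, rather than failing the scan.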
[jira] [Commented] (HDFS-6673) Add Delimited format support for PB OIV tool
[ https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292424#comment-14292424 ] Colin Patrick McCabe commented on HDFS-6673: bq. The whole point of this tool is to run the oiv on machines that do not have the luxury of abundant memory. Can you clarify what point you are trying to make? I think that the point that Andrew is trying to make is that this tool will run quickly on machines with more memory, while still being possible to use on machines with less memory. bq. Can you clarify what the greater functionality are? The Delimiter only outputs mtime/atime and other information available from legacy fsimage. Andrew already commented that HDFS-6293 doesn't include metadata newer than the old format. As we add more features over time, printing data from the legacy fsimage will become less and less useful. Eventually we will probably want to drop support entirely, perhaps in Hadoop 3.0. There is a maintenance burden associated with maintaining two image formats. Add Delimited format support for PB OIV tool - Key: HDFS-6673 URL: https://issues.apache.org/jira/browse/HDFS-6673 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Minor Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, HDFS-6673.005.patch, HDFS-6673.006.patch The new oiv tool, which is designed for the Protobuf fsimage, lacks a few features supported in the old {{oiv}} tool. This task adds support for the _Delimited_ processor to the oiv tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7611) deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-7611: Attachment: HDFS-7611.000.patch Upload the patch using the above #2 approach. deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart. --- Key: HDFS-7611 URL: https://issues.apache.org/jira/browse/HDFS-7611 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Konstantin Shvachko Assignee: Byron Wong Priority: Critical Attachments: HDFS-7611.000.patch, blocksNotDeletedTest.patch, testTruncateEditLogLoad.log If quotas are enabled a combination of operations *deleteSnapshot* and *delete* of a file can leave orphaned blocks in the blocksMap on NameNode restart. They are counted as missing on the NameNode, and can prevent NameNode from coming out of safeMode and could cause memory leak during startup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-7411: -- Attachment: hdfs-7411.009.patch Added the @Ignore in this latest patch, no other changes. Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, hdfs-7411.006.patch, hdfs-7411.007.patch, hdfs-7411.008.patch, hdfs-7411.009.patch Would be nice to split out decommission logic from DatanodeManager to DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7339) Allocating and persisting block groups in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292496#comment-14292496 ] Zhe Zhang commented on HDFS-7339: - Thanks for the analysis [~szetszwo]. The basic tradeoff is the compactness of ID space versus lookup overhead. I agree option #1 should be ruled out (most compact allocation, slowest lookup). From options #2~#5 the trend is sparser ID allocation; more invariants are guaranteed as a benefit. However, it seems all of them require an additional lookup (either in {{blocksMap}} or in the map of inodes) to identify a non-EC block? For example, when a block report for *0x331* arrives, we don't know if it's a non-EC block, or an EC block in the group *0x330*. So we must lookup {{blocksMap}} for *0x330* and get a miss or find the inode and obtain the storage policy. Whereas separating the ID space with a binary flag leads to 1 lookup (except for legacy, randomly generated block IDs). Allocating and persisting block groups in NameNode -- Key: HDFS-7339 URL: https://issues.apache.org/jira/browse/HDFS-7339 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-7339-001.patch, HDFS-7339-002.patch, HDFS-7339-003.patch, HDFS-7339-004.patch, HDFS-7339-005.patch, HDFS-7339-006.patch, Meta-striping.jpg, NN-stripping.jpg All erasure codec operations center around the concept of _block group_; they are formed in initial encoding and looked up in recoveries and conversions. A lightweight class {{BlockGroup}} is created to record the original and parity blocks in a coding group, as well as a pointer to the codec schema (pluggable codec schemas will be supported in HDFS-7337). With the striping layout, the HDFS client needs to operate on all blocks in a {{BlockGroup}} concurrently. Therefore we propose to extend a file’s inode to switch between _contiguous_ and _striping_ modes, with the current mode recorded in a binary flag. 
An array of BlockGroups (or BlockGroup IDs) is added, which remains empty for “traditional” HDFS files with contiguous block layout. The NameNode creates and maintains {{BlockGroup}} instances through the new {{ECManager}} component; the attached figure has an illustration of the architecture. As a simple example, when a {_Striping+EC_} file is created and written to, it will serve requests from the client to allocate new {{BlockGroups}} and store them under the {{INodeFile}}. In the current phase, {{BlockGroups}} are allocated both in initial online encoding and in the conversion from replication to EC. {{ECManager}} also facilitates the lookup of {{BlockGroup}} information for block recovery work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
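The single-lookup scheme discussed in the comment above reserves one bit of the 64-bit block ID as a striped/contiguous flag, with low-order bits of a striped ID indexing the block within its group; both the flag test and the group-lookup key then reduce to bit operations. A sketch under those assumptions (the concrete bit positions and mask widths are illustrative, not the committed encoding):

```java
// Illustrative encoding of the binary-flag scheme discussed above:
// the top bit marks a striped (EC) block, the low 4 bits index the block
// within its group, and the group ID is the ID with the index bits cleared.
public class EcBlockIdSketch {
    static final long STRIPED_FLAG = 1L << 63;
    static final long INDEX_MASK = 0xFL;

    static boolean isStriped(long blockId) {
        return (blockId & STRIPED_FLAG) != 0;
    }

    /** Key for the single blocksMap lookup of a striped block's group. */
    static long groupId(long blockId) {
        return blockId & ~INDEX_MASK;
    }

    static int indexInGroup(long blockId) {
        return (int) (blockId & INDEX_MASK);
    }

    public static void main(String[] args) {
        long group = STRIPED_FLAG | 0x330L;
        long member = group | 0x1L;  // the "0x331" report from the example above
        if (!isStriped(member)) throw new AssertionError();
        if (groupId(member) != group) throw new AssertionError();
        if (indexInGroup(member) != 1) throw new AssertionError();
        // A plain (non-EC) 0x331 fails the flag test: no group lookup needed.
        if (isStriped(0x331L)) throw new AssertionError();
    }
}
```

With this encoding a block report is classified by one bit test, versus the extra blocksMap or inode lookup the denser ID-allocation options require.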
[jira] [Updated] (HDFS-7682) {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content
[ https://issues.apache.org/jira/browse/HDFS-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7682: --- Attachment: HDFS-7682.000.patch Posting patch for a jenkins run. {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content Key: HDFS-7682 URL: https://issues.apache.org/jira/browse/HDFS-7682 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-7682.000.patch DistributedFileSystem#getFileChecksum of a snapshotted file includes non-snapshotted content. This happens because DistributedFileSystem#getFileChecksum simply calculates the checksum of all of the CRCs from the blocks in the file. But, in the case of a snapshotted file, we don't want to include data in the checksum that was appended to the last block in the file after the snapshot was taken. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-7341) Add initial snapshot support based on pipeline recovery
[ https://issues.apache.org/jira/browse/HDFS-7341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe resolved HDFS-7341. Resolution: Duplicate Duplicate of HDFS-3107 Add initial snapshot support based on pipeline recovery --- Key: HDFS-7341 URL: https://issues.apache.org/jira/browse/HDFS-7341 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Colin Patrick McCabe Attachments: HDFS-3107_Nov3.patch, editsStored_Nov3, editsStored_Nov3.xml Add initial snapshot support based on pipeline recovery. This iteration does not support snapshots or rollback. This support will be added in the HDFS-3107 branch by later subtasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7577) Add additional headers that are needed by Windows
[ https://issues.apache.org/jira/browse/HDFS-7577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thanh Do updated HDFS-7577: --- Attachment: HDFS-7577-branch-HDFS-6994-2.patch Hi [~cmccabe]. Attached is another patch that throws an error if libhdfs3 is compiled on Windows with a non-x86 processor. Add additional headers that are needed by Windows Key: HDFS-7577 URL: https://issues.apache.org/jira/browse/HDFS-7577 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Thanh Do Assignee: Thanh Do Attachments: HDFS-7577-branch-HDFS-6994-0.patch, HDFS-7577-branch-HDFS-6994-1.patch, HDFS-7577-branch-HDFS-6994-2.patch This jira involves adding a list of (mostly dummy) headers that are available in POSIX systems, but not in Windows. One step towards making libhdfs3 buildable on Windows. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7682) {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content
Charles Lamb created HDFS-7682: -- Summary: {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content Key: HDFS-7682 URL: https://issues.apache.org/jira/browse/HDFS-7682 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb DistributedFileSystem#getFileChecksum of a snapshotted file includes non-snapshotted content. This happens because DistributedFileSystem#getFileChecksum simply calculates the checksum of all of the CRCs from the blocks in the file. But, in the case of a snapshotted file, we don't want to include data in the checksum that was appended to the last block in the file after the snapshot was taken. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3107) HDFS truncate
[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292447#comment-14292447 ] Tsz Wo Nicholas Sze commented on HDFS-3107: --- Thanks a lot for all the great works! HDFS truncate - Key: HDFS-3107 URL: https://issues.apache.org/jira/browse/HDFS-3107 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Lei Chang Assignee: Plamen Jeliazkov Fix For: 2.7.0 Attachments: HDFS-3107-13.patch, HDFS-3107-14.patch, HDFS-3107-15.patch, HDFS-3107-HDFS-7056-combined.patch, HDFS-3107.008.patch, HDFS-3107.15_branch2.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf, editsStored, editsStored.xml Original Estimate: 1,344h Remaining Estimate: 1,344h Systems with transaction support often need to undo changes made to the underlying storage when a transaction is aborted. Currently HDFS does not support truncate (a standard Posix operation) which is a reverse operation of append, which makes upper layer applications use ugly workarounds (such as keeping track of the discarded byte range per file in a separate metadata store, and periodically running a vacuum process to rewrite compacted files) to overcome this limitation of HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7611) deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292454#comment-14292454 ] Jing Zhao commented on HDFS-7611: - Thanks for digging into the issue, [~Byron Wong]! So currently we have two ways to fix the issue: # While applying the editlog, instead of calling {{INode#addSpaceConsumed}}, we should use {{FSDirectory#updateCount}} which checks if image/editlog has been loaded. # We do not compute quota change and update quota usage in {{FSDirectory#removeLastINode}} anymore. Instead, we move the quota computation/update part to its caller. In this way, the quota usage change, even if it's wrong, will not affect the real deletion. Both changes actually are necessary. But #1 requires a lot of code refactoring. Since #2 alone can also fix the reported bug, I guess we can do #1 in a separate jira. deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart. --- Key: HDFS-7611 URL: https://issues.apache.org/jira/browse/HDFS-7611 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Konstantin Shvachko Assignee: Byron Wong Priority: Critical Attachments: blocksNotDeletedTest.patch, testTruncateEditLogLoad.log If quotas are enabled a combination of operations *deleteSnapshot* and *delete* of a file can leave orphaned blocks in the blocksMap on NameNode restart. They are counted as missing on the NameNode, and can prevent NameNode from coming out of safeMode and could cause memory leak during startup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7018) Implement C interface for libhdfs3
[ https://issues.apache.org/jira/browse/HDFS-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhanwei Wang updated HDFS-7018: --- Attachment: (was: HDFS-7018-pnative.004.patch) Implement C interface for libhdfs3 -- Key: HDFS-7018 URL: https://issues.apache.org/jira/browse/HDFS-7018 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Zhanwei Wang Assignee: Zhanwei Wang Attachments: HDFS-7018-pnative.002.patch, HDFS-7018-pnative.003.patch, HDFS-7018-pnative.004.patch, HDFS-7018.patch Implement C interface for libhdfs3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7018) Implement C interface for libhdfs3
[ https://issues.apache.org/jira/browse/HDFS-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292915#comment-14292915 ] Zhanwei Wang commented on HDFS-7018: Hi [~cmccabe] In the new patch, 1) simplify the exception handling to only one catch block {code} catch (...) { HandleException(current_exception()); } {code} 2) remove {{PARAMETER_ASSERT}} 3) simplify {{struct hdfsFile_internal}} as you suggested. 4) remove some functions which are not supported in libhdfs. Let's discuss and add them in other jira. As you suggested, searching CLASSPATH for XML configure files will be done in another jira, so I keep the current implementation. Implement C interface for libhdfs3 -- Key: HDFS-7018 URL: https://issues.apache.org/jira/browse/HDFS-7018 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Zhanwei Wang Assignee: Zhanwei Wang Attachments: HDFS-7018-pnative.002.patch, HDFS-7018-pnative.003.patch, HDFS-7018-pnative.004.patch, HDFS-7018.patch Implement C interface for libhdfs3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7018) Implement C interface for libhdfs3
[ https://issues.apache.org/jira/browse/HDFS-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhanwei Wang updated HDFS-7018: --- Attachment: HDFS-7018-pnative.004.patch Implement C interface for libhdfs3 -- Key: HDFS-7018 URL: https://issues.apache.org/jira/browse/HDFS-7018 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Zhanwei Wang Assignee: Zhanwei Wang Attachments: HDFS-7018-pnative.002.patch, HDFS-7018-pnative.003.patch, HDFS-7018-pnative.004.patch, HDFS-7018.patch Implement C interface for libhdfs3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7353) Raw Erasure Coder API for concrete encoding and decoding
[ https://issues.apache.org/jira/browse/HDFS-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292919#comment-14292919 ] Kai Zheng commented on HDFS-7353: - Hi [~zhz] or [~szetszwo], 1. I'm sorry to be about naming again, but regarding better name for dataSize, how about numDataUnits or dataUnitsCount? 2. About why we need the 3rd version encode()/decode(), it is because in above layer in ErasureCoder, ECChunks are extracted from blocks and then they're passed down here for the encoding/decoding. How to get bytes or ByteBuffer from ECChunk, it depends and therefore better have the logic centrally here. Generally, in pure Java implementation, bytes are allocated in heap and used; in ISA-L, better to obtain ByteBuffer from off-heap for performance consideration. Raw Erasure Coder API for concrete encoding and decoding Key: HDFS-7353 URL: https://issues.apache.org/jira/browse/HDFS-7353 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Fix For: HDFS-EC Attachments: HDFS-7353-v1.patch, HDFS-7353-v2.patch, HDFS-7353-v3.patch, HDFS-7353-v4.patch This is to abstract and define raw erasure coder API across different codes algorithms like RS, XOR and etc. Such API can be implemented by utilizing various library support, such as Intel ISA library and Jerasure library. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
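The byte[] / ByteBuffer duality Kai raises can be sketched with the simplest raw code, XOR, where the single parity unit is the bytewise XOR of all data units. The names below are illustrative, not the HDFS-7353 API; the point is the two encode() variants over heap arrays and (possibly off-heap, e.g. for ISA-L) buffers:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Minimal XOR "raw coder" illustrating the two encode() variants discussed
// above: one over heap byte arrays, one over ByteBuffers. Names are
// illustrative; this is not the HDFS-7353 interface.
public class XorRawEncoderSketch {
    /** parity[i] = XOR of inputs[k][i] over all data units k. */
    static void encode(byte[][] inputs, byte[] parity) {
        Arrays.fill(parity, (byte) 0);
        for (byte[] unit : inputs) {
            for (int i = 0; i < parity.length; i++) {
                parity[i] ^= unit[i];
            }
        }
    }

    /** Same computation over ByteBuffers, using absolute get/put. */
    static void encode(ByteBuffer[] inputs, ByteBuffer parity) {
        for (int i = 0; i < parity.remaining(); i++) {
            byte b = 0;
            for (ByteBuffer unit : inputs) {
                b ^= unit.get(unit.position() + i);
            }
            parity.put(parity.position() + i, b);
        }
    }

    public static void main(String[] args) {
        byte[][] data = { {1, 2}, {4, 8}, {1, 1} };
        byte[] parity = new byte[2];
        encode(data, parity);
        if (parity[0] != 4 || parity[1] != 11) throw new AssertionError();
        // XOR decode is the same operation: losing one unit, XOR of the
        // surviving units plus parity reconstructs it.
        byte[][] survivors = { {4, 8}, {1, 1}, parity };
        byte[] recovered = new byte[2];
        encode(survivors, recovered);
        if (recovered[0] != 1 || recovered[1] != 2) throw new AssertionError();
    }
}
```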
[jira] [Commented] (HDFS-49) MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found
[ https://issues.apache.org/jira/browse/HDFS-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292847#comment-14292847 ] Hadoop QA commented on HDFS-49: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12667794/HDFS-49-002.patch against trunk revision 1f2b695. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestSafeMode org.apache.hadoop.hdfs.server.datanode.TestBlockScanner The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9332//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9332//console This message is automatically generated. 
MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found --- Key: HDFS-49 URL: https://issues.apache.org/jira/browse/HDFS-49 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.204.0, 0.20.205.0, 1.1.0 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Labels: codereview, newbie Attachments: HDFS-49-002.patch, hdfs-49.patch Original Estimate: 0.5h Remaining Estimate: 0.5h The stopDataNode method will shut down the last node in the list of nodes if one matching a specific name is not found. This is possibly not what was intended. Better to return false or fail in some other manner if the named node was not located. synchronized boolean stopDataNode(String name) { int i; for (i = 0; i < dataNodes.size(); i++) { DataNode dn = dataNodes.get(i).datanode; if (dn.dnRegistration.getName().equals(name)) { break; } } return stopDataNode(i); } -- This message was sent by Atlassian JIRA (v6.3.4#6332)
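A minimal corrected version along the lines the report suggests would return false instead of falling through to whatever index the loop ended on. This standalone sketch mimics the shape of the buggy method (the list-of-names stand-in and method bodies are illustrative, not the committed HDFS-49 patch):

```java
import java.util.ArrayList;
import java.util.List;

// Standalone illustration of the fix suggested above: return false when no
// node matches, instead of silently stopping the node at the loop's final index.
public class StopDataNodeSketch {
    final List<String> names = new ArrayList<>();

    boolean stopDataNode(String name) {
        for (int i = 0; i < names.size(); i++) {
            if (names.get(i).equals(name)) {
                names.remove(i);   // stand-in for the real shutdown bookkeeping
                return true;
            }
        }
        return false;              // named node not found: no silent shutdown
    }

    public static void main(String[] args) {
        StopDataNodeSketch cluster = new StopDataNodeSketch();
        cluster.names.add("dn1:50010");
        cluster.names.add("dn2:50010");
        if (cluster.stopDataNode("no-such-node")) throw new AssertionError();
        if (cluster.names.size() != 2) throw new AssertionError();  // nothing stopped
        if (!cluster.stopDataNode("dn1:50010")) throw new AssertionError();
        if (cluster.names.size() != 1) throw new AssertionError();
    }
}
```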
[jira] [Commented] (HDFS-3689) Add support for variable length block
[ https://issues.apache.org/jira/browse/HDFS-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292877#comment-14292877 ] Tsz Wo Nicholas Sze commented on HDFS-3689: --- - Add an if-statement to check targetRepl - src.getBlockReplication() != 0 before adding it to delta since src.computeFileSize() is a little bit expensive. - The if-condition below should check if delta <= 0, and the comment "if delta is 0" should be updated to "if delta is <= 0". {code} +if (!fsd.getFSNamesystem().isImageLoaded() || fsd.shouldSkipQuotaChecks()) { + // Do not check quota if delta is 0 or editlog is still being processed + return; {code} Add support for variable length block - Key: HDFS-3689 URL: https://issues.apache.org/jira/browse/HDFS-3689 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, hdfs-client, namenode Affects Versions: 3.0.0 Reporter: Suresh Srinivas Assignee: Jing Zhao Attachments: HDFS-3689.000.patch, HDFS-3689.001.patch, HDFS-3689.002.patch, HDFS-3689.003.patch, HDFS-3689.003.patch, HDFS-3689.004.patch, HDFS-3689.005.patch, HDFS-3689.006.patch, HDFS-3689.007.patch, HDFS-3689.008.patch, HDFS-3689.008.patch, HDFS-3689.009.patch, HDFS-3689.009.patch, editsStored Currently HDFS supports fixed length blocks. Supporting variable length blocks will allow new use cases and features to be built on top of HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7018) Implement C interface for libhdfs3
[ https://issues.apache.org/jira/browse/HDFS-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhanwei Wang updated HDFS-7018: --- Attachment: HDFS-7018-pnative.004.patch Implement C interface for libhdfs3 -- Key: HDFS-7018 URL: https://issues.apache.org/jira/browse/HDFS-7018 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Zhanwei Wang Assignee: Zhanwei Wang Attachments: HDFS-7018-pnative.002.patch, HDFS-7018-pnative.003.patch, HDFS-7018-pnative.004.patch, HDFS-7018.patch Implement C interface for libhdfs3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-7584) Enable Quota Support for Storage Types
[ https://issues.apache.org/jira/browse/HDFS-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292891#comment-14292891 ] Arpit Agarwal edited comment on HDFS-7584 at 1/27/15 2:50 AM: -- [~xyao], considering the size of the patch, do you think we should split it into at least two smaller patches? I can think of at least one natural split: # Part 1 is the API, protocol and tool changes. # Part 2 is the NameNode implementation. In any case I started looking at the NN changes. Some initial feedback below, mostly nitpicks. I will look at the rest later this week: # I did not understand the todo in INode.java:464. If it is something that would be broken and is not too hard to fix, perhaps we should include it in the same checkin? This is perhaps another argument for splitting into two patches at a higher level. # QuotaCounts: It has four telescoping constructors, all private. It is a little confusing. Can we simplify the constructors? e.g. the default constructor can be replaced with initializers. # QuotaCounts: {{typeSpaces}} and {{typeCounts}} are used interchangeably. We should probably name them consistently. # NameNodeLayoutVersion: description of the new layout is too terse, probably unintentional. # Could you please add a short comment for {{ONE_NAMESPACE}}? I realize it was even more confusing before, thanks for adding the static initializer. # {{INode.getQuotaCounts}} - don't need local variable {{qc}}. # Nitpick, optional: {{EnumCounters.allLessOrEqual}} and {{.anyGreatOrEqual}} - can we use foreach loop? # DFSAdmin.java: _The space quota is set onto storage type_ should be _The storage type specific quota is set when ..._ # Unintentional whitespace changes in Quota.java? Enable Quota Support for Storage Types -- Key: HDFS-7584 URL: https://issues.apache.org/jira/browse/HDFS-7584 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7584 Quota by Storage Type - 01202015.pdf, HDFS-7584.0.patch, HDFS-7584.1.patch, HDFS-7584.2.patch, HDFS-7584.3.patch, HDFS-7584.4.patch, editsStored Phase II of the Heterogeneous Storage feature was completed by HDFS-6584. This JIRA is opened to enable Quota support of different storage types in terms of storage space usage. This is more important for certain storage types such as SSD as it is precious and more performant. 
As described in the design doc of HDFS-5682, we plan to add new quotaByStorageType command and new name node RPC protocol for it. The quota by storage type feature is applied to HDFS directory level similar to traditional HDFS space quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7584) Enable Quota Support for Storage Types
[ https://issues.apache.org/jira/browse/HDFS-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292891#comment-14292891 ] Arpit Agarwal commented on HDFS-7584: - [~xyao], considering the size of the patch, do you think we should split it into at least two smaller patches? I can think of at least one natural split: # Part 1 is the API, protocol and tool changes. # Part 2 is the NameNode implementation. In any case I started looking at the NN changes. Some initial feedback below, I will look at the rest later this week: # I did not understand the todo in INode.java:464. If it is something that would be broken and is not too hard to fix, perhaps we should include it in the same checkin? This is perhaps another argument for splitting into two patches at a higher level. # QuotaCounts: It has four telescoping constructors, all private. It is a little confusing. e.g. the default constructor can be replaced with initializers. # QuotaCounts: {{typeSpaces}} and {{typeCounts}} are used interchangeably. We should probably name them consistently. # NameNodeLayoutVersion: description of the new layout is too terse, probably unintentional. # Could you please add a short comment for {{ONE_NAMESPACE}}? I realize it was even more confusing before, thanks for adding the static initializer. # {{INode.getQuotaCounts}} - don't need local variable {{qc}}. # Nitpick, optional: {{EnumCounters.allLessOrEqual}} and {{.anyGreatOrEqual}} - can we use foreach loop? # DFSAdmin.java: _The space quota is set onto storage type_ should be _The storage type specific quota is set when ..._ # Unintentional whitespace changes in Quota.java? 
Enable Quota Support for Storage Types -- Key: HDFS-7584 URL: https://issues.apache.org/jira/browse/HDFS-7584 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7584 Quota by Storage Type - 01202015.pdf, HDFS-7584.0.patch, HDFS-7584.1.patch, HDFS-7584.2.patch, HDFS-7584.3.patch, HDFS-7584.4.patch, editsStored Phase II of the Heterogeneous Storage feature was completed by HDFS-6584. This JIRA is opened to enable Quota support of different storage types in terms of storage space usage. This is more important for certain storage types such as SSD as it is precious and more performant. As described in the design doc of HDFS-5682, we plan to add a new quotaByStorageType command and a new name node RPC protocol for it. The quota by storage type feature is applied at the HDFS directory level, similar to the traditional HDFS space quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
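The telescoping-constructor concern in the review above is the classic case for a builder. A generic sketch of the suggested shape follows; the class and field names are hypothetical stand-ins, not the HDFS-7584 {{QuotaCounts}} code:

```java
// Illustrative builder replacing four telescoping private constructors,
// as the review above suggests. Class and field names are hypothetical.
public class QuotaCountsSketch {
    final long nameSpace;
    final long storageSpace;
    final long[] typeSpaces;   // one slot per storage type

    private QuotaCountsSketch(Builder b) {
        this.nameSpace = b.nameSpace;
        this.storageSpace = b.storageSpace;
        this.typeSpaces = b.typeSpaces;
    }

    static class Builder {
        private long nameSpace;                    // defaults come from field
        private long storageSpace;                 // initializers, so no
        private long[] typeSpaces = new long[0];   // constructor chains needed

        Builder nameSpace(long v) { this.nameSpace = v; return this; }
        Builder storageSpace(long v) { this.storageSpace = v; return this; }
        Builder typeSpaces(long[] v) { this.typeSpaces = v; return this; }
        QuotaCountsSketch build() { return new QuotaCountsSketch(this); }
    }

    public static void main(String[] args) {
        QuotaCountsSketch q = new Builder()
            .nameSpace(100)
            .storageSpace(1L << 30)
            .build();
        if (q.nameSpace != 100) throw new AssertionError();
        if (q.storageSpace != (1L << 30)) throw new AssertionError();
        if (q.typeSpaces.length != 0) throw new AssertionError();  // default applied
    }
}
```

Callers set only the counters they care about, which removes the ambiguity of which private constructor a given argument list resolves to.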
[jira] [Updated] (HDFS-7353) Raw Erasure Coder API for concrete encoding and decoding
[ https://issues.apache.org/jira/browse/HDFS-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Zheng updated HDFS-7353: Attachment: HDFS-7353-v5.patch Updated the patch according to the above review comments. Raw Erasure Coder API for concrete encoding and decoding Key: HDFS-7353 URL: https://issues.apache.org/jira/browse/HDFS-7353 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Fix For: HDFS-EC Attachments: HDFS-7353-v1.patch, HDFS-7353-v2.patch, HDFS-7353-v3.patch, HDFS-7353-v4.patch, HDFS-7353-v5.patch This is to abstract and define a raw erasure coder API across different coding algorithms such as RS, XOR, etc. Such an API can be implemented by utilizing various libraries, such as the Intel ISA-L library and the Jerasure library.
[jira] [Commented] (HDFS-49) MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found
[ https://issues.apache.org/jira/browse/HDFS-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292825#comment-14292825 ] Ravi Prakash commented on HDFS-49: -- The patch looks good to me; however, the two test failures should probably be fixed too. The testReport has probably been deleted from the Jenkins workspace, so I'll just cancel and submit the patch once again in the hope that Jenkins will run again. MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found --- Key: HDFS-49 URL: https://issues.apache.org/jira/browse/HDFS-49 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.204.0, 0.20.205.0, 1.1.0 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Labels: codereview, newbie Attachments: HDFS-49-002.patch, hdfs-49.patch Original Estimate: 0.5h Remaining Estimate: 0.5h The stopDataNode method will shut down the last node in the list of nodes if one matching a specific name is not found. This is possibly not what was intended. Better to return false or fail in some other manner if the named node was not located:
{code}
synchronized boolean stopDataNode(String name) {
  int i;
  for (i = 0; i < dataNodes.size(); i++) {
    DataNode dn = dataNodes.get(i).datanode;
    if (dn.dnRegistration.getName().equals(name)) {
      break;
    }
  }
  return stopDataNode(i);
}
{code}
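A minimal standalone sketch of the proposed behavior - returning {{false}} when no name matches, instead of falling through and stopping an arbitrary node - might look like this. The {{Node}} class and the simplified method shape are stand-ins for the real MiniDFSCluster/DataNode types, not the actual patch.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the proposed fix: only stop a node whose name matches; report
// failure otherwise. Node stands in for the real DataNode bookkeeping.
public class StopNodeSketch {
    static class Node {
        final String name;
        boolean stopped;
        Node(String name) { this.name = name; }
    }

    private final List<Node> nodes = new ArrayList<>();

    public void add(String name) {
        nodes.add(new Node(name));
    }

    // Returns true only if a node with the given name was found and stopped.
    public synchronized boolean stopDataNode(String name) {
        for (Node n : nodes) {
            if (n.name.equals(name)) {
                n.stopped = true;
                return true;
            }
        }
        return false; // previously: fell through and stopped the last node
    }
}
```

Callers that relied on the old fall-through behavior would now get an explicit failure signal instead of silently losing an unrelated node.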
[jira] [Updated] (HDFS-49) MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found
[ https://issues.apache.org/jira/browse/HDFS-49?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated HDFS-49: - Status: Open (was: Patch Available)
[jira] [Updated] (HDFS-5796) The file system browser in the namenode UI requires SPNEGO.
[ https://issues.apache.org/jira/browse/HDFS-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated HDFS-5796: -- Attachment: HDFS-5796.3.patch The file system browser in the namenode UI requires SPNEGO. --- Key: HDFS-5796 URL: https://issues.apache.org/jira/browse/HDFS-5796 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Arun Suresh Attachments: HDFS-5796.1.patch, HDFS-5796.1.patch, HDFS-5796.2.patch, HDFS-5796.3.patch, HDFS-5796.3.patch After HDFS-5382, the browser makes webhdfs REST calls directly, requiring SPNEGO to work between the user's browser and the namenode. This won't work if the cluster's security infrastructure is isolated from the regular network. Moreover, SPNEGO is not supposed to be required for user-facing web pages.
[jira] [Commented] (HDFS-7584) Enable Quota Support for Storage Types
[ https://issues.apache.org/jira/browse/HDFS-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14293051#comment-14293051 ] Xiaoyu Yao commented on HDFS-7584: -- Thanks Arpit again for the feedback. I've fixed your comments 1, 2, 5.
bq. Just an observation, and you don't really need to fix it. I realize you are trying to avoid a new RPC call by modifying the existing ClientProtocol.setQuota call. But it does create a confusing difference between that and DFSClient.setQuota, which has two overloads, and the matching overload of DFSClient.setQuota behaves differently (throws exception on null). Perhaps it is better to add a new ClientProtocol.setQuota RPC call. Either is fine though.
The decision to reuse the existing setQuota RPC call compatibly instead of adding a new one is based on feedback from the design review. The API that throws an exception on a null storage type is the new API DFSClient#setQuotaByStorageType, which has a different signature. We keep the original DFSClient#setQuota(dsQuota, nsQuota) as-is so that it won't break existing clients. With a single setQuota RPC message, I think it should be fine.
bq. Do we need the new config key DFS_QUOTA_BY_STORAGETYPE_ENABLED_KEY? The administrator can already choose to avoid configuring per-type quotas so I am not sure the new configuration is useful.
Adding the key allows us to completely disable the feature. Without the key, the admin can accidentally configure and enable this feature. I can remove it if this is not needed.
[jira] [Commented] (HDFS-7353) Raw Erasure Coder API for concrete encoding and decoding
[ https://issues.apache.org/jira/browse/HDFS-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14293054#comment-14293054 ] Kai Zheng commented on HDFS-7353: -
bq. numDataUnits probably sounds a little better.
OK, I'll use it the next time I update the patch. I didn't notice your comment when uploading the revision.
[jira] [Commented] (HDFS-7339) Allocating and persisting block groups in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292864#comment-14292864 ] Jing Zhao commented on HDFS-7339: -
bq. Change the hash function so that consecutive IDs will be mapped to the same hash value and implement BlockGroup.equal(..) so that it returns true with any block id in the group.
Had an offline discussion with [~szetszwo] about this just now. This new hash function will cause extra scanning in the bucket, since every 16 contiguous blocks will be mapped to the same bucket. Currently, for a large cluster, the blocksMap can contain several million buckets, which is on the same scale as the total number of blocks; thus the current implementation does not require much bucket scanning in the normal case. Therefore I guess we may need to revisit this optimization and maybe do a simple benchmark of it. Back to this jira: maybe we should consider providing a relatively simple implementation first and do the optimization in a separate jira. Either only using blocksMap or allocating an extra blockgroupsMap looks fine to me. Maybe we should also schedule an offline discussion sometime this week. Allocating and persisting block groups in NameNode -- Key: HDFS-7339 URL: https://issues.apache.org/jira/browse/HDFS-7339 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-7339-001.patch, HDFS-7339-002.patch, HDFS-7339-003.patch, HDFS-7339-004.patch, HDFS-7339-005.patch, HDFS-7339-006.patch, Meta-striping.jpg, NN-stripping.jpg All erasure codec operations center around the concept of _block group_; they are formed in initial encoding and looked up in recoveries and conversions. A lightweight class {{BlockGroup}} is created to record the original and parity blocks in a coding group, as well as a pointer to the codec schema (pluggable codec schemas will be supported in HDFS-7337).
With the striping layout, the HDFS client needs to operate on all blocks in a {{BlockGroup}} concurrently. Therefore we propose to extend a file's inode to switch between _contiguous_ and _striping_ modes, with the current mode recorded in a binary flag. An array of BlockGroups (or BlockGroup IDs) is added, which remains empty for "traditional" HDFS files with a contiguous block layout. The NameNode creates and maintains {{BlockGroup}} instances through the new {{ECManager}} component; the attached figure has an illustration of the architecture. As a simple example, when a _Striping+EC_ file is created and written to, it will serve requests from the client to allocate new {{BlockGroups}} and store them under the {{INodeFile}}. In the current phase, {{BlockGroups}} are allocated both in initial online encoding and in the conversion from replication to EC. {{ECManager}} also facilitates the lookup of {{BlockGroup}} information for block recovery work.
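The hash trick discussed in the comment above - mapping consecutive block IDs to the same bucket - can be sketched as follows. The group size of 16 and the method names are assumptions for illustration only, not the actual HDFS-7339 implementation.

```java
// Sketch of the discussed hashing scheme: drop the low 4 bits of a block ID
// so that 16 consecutive IDs (one block group) share a hash bucket.
// Group size and names are assumptions, not the real HDFS-7339 code.
public class BlockGroupHash {
    static final int GROUP_BITS = 4; // 2^4 = 16 blocks per group

    // All IDs within one group of 16 map to the same key.
    public static long groupKey(long blockId) {
        return blockId >>> GROUP_BITS;
    }

    // Two block IDs land in the same bucket iff they share a group key;
    // this is also why every lookup may have to scan up to 16 entries.
    public static boolean sameBucket(long id1, long id2) {
        return groupKey(id1) == groupKey(id2);
    }
}
```

This makes concrete the concern raised above: every bucket holding a group necessarily chains up to 16 entries, trading per-lookup scanning for fewer map entries.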
[jira] [Commented] (HDFS-7611) deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292893#comment-14292893 ] Hadoop QA commented on HDFS-7611: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694651/HDFS-7611.001.patch against trunk revision 6f9fe76. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot org.apache.hadoop.hdfs.server.datanode.TestBlockScanner Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9334//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9334//console This message is automatically generated. deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart. 
--- Key: HDFS-7611 URL: https://issues.apache.org/jira/browse/HDFS-7611 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Konstantin Shvachko Assignee: Byron Wong Priority: Critical Attachments: HDFS-7611.000.patch, HDFS-7611.001.patch, blocksNotDeletedTest.patch, testTruncateEditLogLoad.log If quotas are enabled, a combination of the operations *deleteSnapshot* and *delete* of a file can leave orphaned blocks in the blocksMap on NameNode restart. They are counted as missing on the NameNode, can prevent the NameNode from coming out of safe mode, and could cause a memory leak during startup.
[jira] [Commented] (HDFS-7584) Enable Quota Support for Storage Types
[ https://issues.apache.org/jira/browse/HDFS-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14293018#comment-14293018 ] Xiaoyu Yao commented on HDFS-7584: -- Thanks Arpit for the review and feedback. I will provide a new patch with the updates soon.
bq. I can think of at least one natural split: Part 1 is the API, protocol and tool changes. Part 2 is the NameNode implementation.
That sounds good to me. I will try splitting it up after finishing the first round of review.
bq. QuotaCounts: It has four telescoping constructors, all private. It is a little confusing. Can we simplify the constructors? e.g. the default constructor can be replaced with initializers.
Agree. That's something I plan to refactor as well. I tried the builder pattern, which could help maintain this class when we need to add more counters in the future that need to be initialized.
bq. I did not understand the todo in INode.java:464. If it is something that would be broken and is not too hard to fix perhaps we should include it in the same checkin? This is perhaps another argument for splitting into two patches at a higher level.
This is for the hadoop fs -count -q command, where the majority of the change will not be under hadoop-hdfs-project. It also needs to move StorageType.java from hadoop-hdfs to hadoop-common, which I prefer to change after the initial check-in.
bq. QuotaCounts: typeSpaces and typeCounts are used interchangeably. We should probably name them consistently.
The parameters in the methods use typeSpaces to be consistent with the other parameters (namespace, diskspace). The class member variable is named typeCounts.
bq. Nitpick, optional: EnumCounters.allLessOrEqual and .anyGreaterOrEqual - can we use a foreach loop?
Agree that foreach will make it syntactically neat.
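The builder refactoring discussed above could look roughly like this; the field names mirror the discussion (namespace, diskspace, typeSpaces), but the class shape is an illustrative sketch, not the actual HDFS-7584 patch.

```java
// Illustrative builder for a QuotaCounts-like class, replacing the four
// private telescoping constructors discussed in the review. Defaults are
// set via field initializers; this sketch is not the real patch.
public class QuotaCountsSketch {
    private final long nameSpace;
    private final long diskSpace;
    private final long[] typeSpaces;

    private QuotaCountsSketch(Builder b) {
        this.nameSpace = b.nameSpace;
        this.diskSpace = b.diskSpace;
        this.typeSpaces = b.typeSpaces;
    }

    public long getNameSpace() { return nameSpace; }
    public long getDiskSpace() { return diskSpace; }
    public long getTypeSpace(int type) { return typeSpaces[type]; }

    public static class Builder {
        private long nameSpace = -1;      // defaults via initializers,
        private long diskSpace = -1;      // not a chain of constructors
        private long[] typeSpaces = new long[0];

        public Builder nameSpace(long v) { this.nameSpace = v; return this; }
        public Builder diskSpace(long v) { this.diskSpace = v; return this; }
        public Builder typeSpaces(long... v) { this.typeSpaces = v; return this; }

        public QuotaCountsSketch build() { return new QuotaCountsSketch(this); }
    }
}
```

A builder scales more gracefully than telescoping constructors when new per-storage-type counters are added later: each new counter becomes one new builder method instead of a new constructor overload.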
[jira] [Commented] (HDFS-7677) DistributedFileSystem#truncate should resolve symlinks
[ https://issues.apache.org/jira/browse/HDFS-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14293092#comment-14293092 ] Konstantin Shvachko commented on HDFS-7677: --- Looks good. A few minor nits on the test:
# Could you move {{testTruncate4Symlink()}} to just before {{writeContents()}}? It is now positioned between the two {{testSnapshotWithAppendTruncate()}} methods, which should logically be together.
# Looks like you truncate on the block boundary, so the {{if (!isReady) {...} }} block can be replaced with {{assertTrue("Recovery is not expected.", isReady);}}
# Replacing {{AppendTestUtil.checkFullFile()}} with {{checkFullFile()}} would save a few bytes of code.
[jira] [Commented] (HDFS-49) MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found
[ https://issues.apache.org/jira/browse/HDFS-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292833#comment-14292833 ] Arpit Agarwal commented on HDFS-49: --- +1 for the v002 patch. The test failures are almost certainly unrelated.
[jira] [Updated] (HDFS-7584) Enable Quota Support for Storage Types
[ https://issues.apache.org/jira/browse/HDFS-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-7584: Summary: Enable Quota Support for Storage Types (was: Enable Quota Support for Storage Types (SSD) )
[jira] [Commented] (HDFS-49) MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found
[ https://issues.apache.org/jira/browse/HDFS-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292974#comment-14292974 ] Hadoop QA commented on HDFS-49: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12667794/HDFS-49-002.patch against trunk revision 6f9fe76. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9335//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9335//console This message is automatically generated.
[jira] [Commented] (HDFS-7339) Allocating and persisting block groups in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14293038#comment-14293038 ] Zhe Zhang commented on HDFS-7339: -
bq. Back to this jira, maybe we should consider providing a relatively simple implementation first and do the optimization in a separate jira. Either only using blocksMap or allocating an extra blockgroupsMap looks fine to me. Maybe we should also schedule an offline discussion sometime this week.
I agree we should start with a simpler implementation. An in-person meeting is a good idea. I'm free Tuesday (except for 11~12) and Wednesday. Let me know what time works best for you and I can stop by.
[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292822#comment-14292822 ] Tsz Wo Nicholas Sze commented on HDFS-7285: ---
bq. I created a Google doc which is editable. If possible please login so I know who's making each comment and update.
Thanks for sharing it. I do suggest that we only share it with the contributors who intend to edit the doc. Anyone who wants to edit the doc should send a request to you. It could prevent accidental changes from someone reading the doc. Sound good?
bq. ... If you don't mind the wait I plan to finish updating it before Wednesday. ...
Happy to wait. Please take your time. I will think more about the package name.
Erasure Coding Support inside HDFS -- Key: HDFS-7285 URL: https://issues.apache.org/jira/browse/HDFS-7285 Project: Hadoop HDFS Issue Type: New Feature Reporter: Weihua Jiang Assignee: Zhe Zhang Attachments: ECAnalyzer.py, ECParser.py, HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, fsimage-analysis-20150105.pdf Erasure Coding (EC) can greatly reduce the storage overhead without sacrificing data reliability, compared to the existing HDFS 3-replica approach. For example, if we use a 10+4 Reed-Solomon coding, we can tolerate the loss of 4 blocks, with the storage overhead being only 40%. This makes EC a quite attractive alternative for big data storage, particularly for cold data. Facebook had a related open source project called HDFS-RAID. It used to be one of the contrib packages in HDFS but was removed after Hadoop 2.0 for maintenance reasons. The drawbacks are: 1) it sits on top of HDFS and depends on MapReduce to do encoding and decoding tasks; 2) it can only be used for cold files that are not intended to be appended anymore; 3) the pure Java EC coding implementation is extremely slow in practical use. Due to these, it might not be a good idea to just bring HDFS-RAID back.
We (Intel and Cloudera) are working on a design to build EC into HDFS that gets rid of any external dependencies, making it self-contained and independently maintained. This design lays the EC feature on top of the storage type support and aims to be compatible with existing HDFS features such as caching, snapshots, encryption, and high availability. This design will also support different EC coding schemes, implementations and policies for different deployment scenarios. By utilizing advanced libraries (e.g. the Intel ISA-L library), an implementation can greatly improve the performance of EC encoding/decoding and make the EC solution even more attractive. We will post the design document soon.
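The 40% figure in the description follows from simple arithmetic: with k data blocks and m parity blocks, the extra storage is m/k of the raw data, versus 2 extra full copies for 3-way replication. A quick check, written as a standalone sketch:

```java
// Quick check of the storage-overhead claim: extra storage as a fraction
// of raw data size, for Reed-Solomon (m parity blocks per k data blocks)
// versus plain n-way replication.
public class EcOverhead {
    // e.g. 10+4 RS: 4 parity blocks per 10 data blocks -> 0.4 (40%)
    public static double rsOverhead(int dataBlocks, int parityBlocks) {
        return (double) parityBlocks / dataBlocks;
    }

    // e.g. 3-replica HDFS: 2 extra full copies -> 2.0 (200%)
    public static double replicationOverhead(int replicas) {
        return replicas - 1.0;
    }
}
```

So 10+4 RS stores 1.4x the raw data while tolerating 4 lost blocks, compared to 3.0x for triple replication tolerating 2 lost copies.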
[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292894#comment-14292894 ] Kai Zheng commented on HDFS-7285: - It's good to use 'erasure_code' for a C library directory name, but not so elegant for a Java package name. How about erasurecoding in full?
[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292788#comment-14292788 ] Hadoop QA commented on HDFS-7411: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694630/hdfs-7411.009.patch against trunk revision 1f2b695. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 12 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.balancer.TestBalancer Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9331//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9331//console This message is automatically generated. Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, hdfs-7411.006.patch, hdfs-7411.007.patch, hdfs-7411.008.patch, hdfs-7411.009.patch Would be nice to split out decommission logic from DatanodeManager to DecommissionManager. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-49) MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found
[ https://issues.apache.org/jira/browse/HDFS-49?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated HDFS-49: - Status: Patch Available (was: Open) MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found --- Key: HDFS-49 URL: https://issues.apache.org/jira/browse/HDFS-49 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 1.1.0, 0.20.205.0, 0.20.204.0 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Labels: codereview, newbie Attachments: HDFS-49-002.patch, hdfs-49.patch Original Estimate: 0.5h Remaining Estimate: 0.5h The stopDataNode method will shut down the last node in the list of nodes if one matching a specific name is not found. This is possibly not what was intended. Better to return false or fail in some other manner if the named node was not located: synchronized boolean stopDataNode(String name) { int i; for (i = 0; i < dataNodes.size(); i++) { DataNode dn = dataNodes.get(i).datanode; if (dn.dnRegistration.getName().equals(name)) { break; } } return stopDataNode(i); } -- This message was sent by Atlassian JIRA (v6.3.4#6332)
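For reference, the fix described above can be sketched in isolation. The snippet below models the datanode list as a plain list of registration names (MiniDFSCluster's real bookkeeping differs; all names here are illustrative); the point is that a miss returns false instead of falling through and stopping an arbitrary node:

```java
import java.util.List;

public class StopDataNodeSketch {
    // Returns the index of the node whose registration name matches,
    // or -1 if no node matches -- instead of falling through to the
    // final loop index the way the original code does.
    static int findDataNode(List<String> registrationNames, String name) {
        for (int i = 0; i < registrationNames.size(); i++) {
            if (registrationNames.get(i).equals(name)) {
                return i;
            }
        }
        return -1;
    }

    // stopDataNode then refuses to stop anything on a miss.
    static boolean stopDataNode(List<String> registrationNames, String name) {
        int i = findDataNode(registrationNames, name);
        if (i < 0) {
            return false;  // named node not found: fail, don't stop a random node
        }
        registrationNames.remove(i);  // stand-in for shutting the node down
        return true;
    }
}
```

Returning a boolean keeps the existing callers working while letting tests assert that a bad name was rejected.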
[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292899#comment-14292899 ] Kai Zheng commented on HDFS-7285: - The rationale for using erasurecoding or erasurecode is simple: if no abbreviation sounds comfortable, then use the full words. Erasure Coding Support inside HDFS -- Key: HDFS-7285 URL: https://issues.apache.org/jira/browse/HDFS-7285 Project: Hadoop HDFS Issue Type: New Feature Reporter: Weihua Jiang Assignee: Zhe Zhang Attachments: ECAnalyzer.py, ECParser.py, HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, fsimage-analysis-20150105.pdf Erasure Coding (EC) can greatly reduce the storage overhead without sacrificing data reliability, compared to the existing HDFS 3-replica approach. For example, if we use a 10+4 Reed-Solomon coding, we can tolerate the loss of 4 blocks with a storage overhead of only 40%. This makes EC a quite attractive alternative for big data storage, particularly for cold data. Facebook had a related open source project called HDFS-RAID. It used to be one of the contributed packages in HDFS but was removed as of Hadoop 2.0 for maintenance reasons. Its drawbacks are: 1) it sits on top of HDFS and depends on MapReduce to do encoding and decoding tasks; 2) it can only be used for cold files that will never be appended to again; 3) the pure Java EC coding implementation is extremely slow in practical use. Because of these, it might not be a good idea to just bring HDFS-RAID back. We (Intel and Cloudera) are working on a design to build EC into HDFS that removes any external dependencies, making it self-contained and independently maintained. This design builds the EC feature on top of the storage type support and aims to be compatible with existing HDFS features such as caching, snapshots, encryption, and high availability.
This design will also support different EC coding schemes, implementations and policies for different deployment scenarios. By utilizing advanced libraries (e.g. the Intel ISA-L library), an implementation can greatly improve the performance of EC encoding/decoding and make the EC solution even more attractive. We will post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
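The 40% figure in the description is straightforward arithmetic: a 10+4 Reed-Solomon layout adds 4 parity blocks per 10 data blocks, while 3-way replication adds 2 full extra copies. A minimal sketch of the comparison:

```java
public class EcOverhead {
    // Extra storage as a fraction of the raw data size under erasure coding:
    // parity blocks divided by data blocks, e.g. 4/10 = 0.4 (40%).
    static double ecOverhead(int dataBlocks, int parityBlocks) {
        return (double) parityBlocks / dataBlocks;
    }

    // Extra storage under n-way replication: the (n - 1) extra copies,
    // e.g. 3 replicas = 2.0 (200%).
    static double replicationOverhead(int replicas) {
        return replicas - 1.0;
    }
}
```

Both schemes here tolerate losing blocks, but the 10+4 RS layout survives any 4 losses at a fifth of the overhead of triplication.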
[jira] [Updated] (HDFS-7630) TestConnCache hardcode block size without considering native OS
[ https://issues.apache.org/jira/browse/HDFS-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam liu updated HDFS-7630: -- Attachment: HDFS-7630.002.patch Combined the patches of HDFS-7624 through HDFS-7630 into one (HDFS-7630.002.patch). TestConnCache hardcode block size without considering native OS --- Key: HDFS-7630 URL: https://issues.apache.org/jira/browse/HDFS-7630 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: sam liu Assignee: sam liu Attachments: HDFS-7630.001.patch, HDFS-7630.002.patch TestConnCache hardcodes the block size with 'BLOCK_SIZE = 4096'; however, this is incorrect on some platforms. For example, on the POWER platform, the correct value is 65536. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7353) Raw Erasure Coder API for concrete encoding and decoding
[ https://issues.apache.org/jira/browse/HDFS-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293040#comment-14293040 ] Zhe Zhang commented on HDFS-7353: - bq. how about numDataUnits or dataUnitsCount Both sound good to me. {{numDataUnits}} probably sounds a little better. Raw Erasure Coder API for concrete encoding and decoding Key: HDFS-7353 URL: https://issues.apache.org/jira/browse/HDFS-7353 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Fix For: HDFS-EC Attachments: HDFS-7353-v1.patch, HDFS-7353-v2.patch, HDFS-7353-v3.patch, HDFS-7353-v4.patch, HDFS-7353-v5.patch, HDFS-7353-v6.patch This is to abstract and define a raw erasure coder API across different code algorithms such as RS and XOR. Such an API can be implemented by utilizing various library support, such as the Intel ISA library and the Jerasure library. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
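To make the API shape under discussion concrete, here is a self-contained toy of the simplest raw coder the issue covers: XOR with a single parity unit, using the {{numDataUnits}} naming agreed above. The class and method names are illustrative only, not the actual HDFS-7353 interfaces:

```java
public class XorRawCoderSketch {
    final int numDataUnits;  // name per the review discussion above

    XorRawCoderSketch(int numDataUnits) {
        this.numDataUnits = numDataUnits;
    }

    // Encode: the single parity unit is the byte-wise XOR of all data units.
    byte[] encode(byte[][] dataUnits) {
        if (dataUnits.length != numDataUnits) {
            throw new IllegalArgumentException("expected " + numDataUnits + " data units");
        }
        byte[] parity = new byte[dataUnits[0].length];
        for (byte[] unit : dataUnits) {
            for (int i = 0; i < parity.length; i++) {
                parity[i] ^= unit[i];
            }
        }
        return parity;
    }

    // Decode one erased data unit by XOR-ing the parity with the survivors.
    byte[] decode(byte[][] dataUnits, int erasedIndex, byte[] parity) {
        byte[] recovered = parity.clone();
        for (int u = 0; u < dataUnits.length; u++) {
            if (u == erasedIndex) continue;
            for (int i = 0; i < recovered.length; i++) {
                recovered[i] ^= dataUnits[u][i];
            }
        }
        return recovered;
    }
}
```

An RS coder plugs into the same encode/decode shape but tolerates multiple erasures; the abstraction in the patch is what lets ISA-L or Jerasure supply the arithmetic underneath.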
[jira] [Updated] (HDFS-7680) Support dataset-specific choice of short circuit implementation
[ https://issues.apache.org/jira/browse/HDFS-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Pallas updated HDFS-7680: - Description: As described in HDFS-5194, the current support for short circuit reading is tightly coupled to the default Dataset implementation. Since alternative implementations of the FsDatasetSpi may use a different short circuit pathway, there needs to be a way for the client to acquire the right kind of BlockReader. Reviewing some considerations: Today, there is only one dataset per datanode (with multiple volumes). Is that likely to change? Can there be multiple datanodes local to a client? (definition of local might depend on dataset implementation) Is it okay to assume that the client and datanode share configuration? More broadly, how should the client discover the appropriate short-circuit implementation? was: As described in HDFS-5194, the current support for short circuit reading is tightly coupled to the default Dataset implementation. Since alternative implementations of the FsDatasetSpi may use a different short circuit pathway, there needs to be a way for the client to acquire the right kind of BlockReader. Reviewing some considerations: Today, there is only one dataset per datanode (with multiple volumes). Is that likely to change? Can there be multiple datanodes local to a client? Is it okay to assume that the client and datanode share configuration? More broadly, how should the client discover the appropriate short-circuit implementation? Support dataset-specific choice of short circuit implementation --- Key: HDFS-7680 URL: https://issues.apache.org/jira/browse/HDFS-7680 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, dfsclient, hdfs-client Affects Versions: 3.0.0 Reporter: Joe Pallas Assignee: Joe Pallas As described in HDFS-5194, the current support for short circuit reading is tightly coupled to the default Dataset implementation. 
Since alternative implementations of the FsDatasetSpi may use a different short circuit pathway, there needs to be a way for the client to acquire the right kind of BlockReader. Reviewing some considerations: Today, there is only one dataset per datanode (with multiple volumes). Is that likely to change? Can there be multiple datanodes local to a client? (definition of local might depend on dataset implementation) Is it okay to assume that the client and datanode share configuration? More broadly, how should the client discover the appropriate short-circuit implementation? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
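One possible reading of "how should the client discover the appropriate short-circuit implementation" is a registry keyed by the dataset implementation the datanode advertises. The sketch below is purely hypothetical; none of these type or method names exist in HDFS, and the real BlockReader machinery is far more involved:

```java
import java.util.HashMap;
import java.util.Map;

public class BlockReaderResolverSketch {
    // Stand-in for a factory that would construct a dataset-appropriate BlockReader.
    interface BlockReaderFactory {
        String newReaderKind();
    }

    // Registry keyed by the dataset implementation name the datanode
    // advertises, so the client can pick a matching short-circuit pathway.
    private final Map<String, BlockReaderFactory> factories = new HashMap<>();

    void register(String datasetImpl, BlockReaderFactory f) {
        factories.put(datasetImpl, f);
    }

    BlockReaderFactory resolve(String datasetImpl) {
        BlockReaderFactory f = factories.get(datasetImpl);
        if (f == null) {
            // No short-circuit support known: caller falls back to remote reads.
            throw new IllegalArgumentException("no short-circuit support for " + datasetImpl);
        }
        return f;
    }
}
```

Whether the advertisement travels via configuration or via an RPC from the datanode is exactly the open question the description raises.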
[jira] [Commented] (HDFS-7677) DistributedFileSystem#truncate should resolve symlinks
[ https://issues.apache.org/jira/browse/HDFS-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292716#comment-14292716 ] Yi Liu commented on HDFS-7677: -- That's great. So [~shv] and [~cmccabe] please help take a look at the patch, thanks. DistributedFileSystem#truncate should resolve symlinks -- Key: HDFS-7677 URL: https://issues.apache.org/jira/browse/HDFS-7677 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-7677.001.patch We should resolve the symlinks in DistributedFileSystem#truncate as we do for {{create}}, {{open}}, {{append}} and so on; I don't see any reason not to support it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7584) Enable Quota Support for Storage Types (SSD)
[ https://issues.apache.org/jira/browse/HDFS-7584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292745#comment-14292745 ] Arpit Agarwal commented on HDFS-7584: - A few more comments to wrap up feedback on the API and protocol. # Continuing from previous, you also don't need the null check in {{PBHelper.convertStorageType}}. # DFSClient.java:3064 - Fix formatting. # Just an observation, and you don't really need to fix it. I realize you are trying to avoid a new RPC call by modifying the existing {{ClientProtocol.setQuota}} call. But it does create a confusing difference between that and {{DFSClient.setQuota}}, which has two overloads, and the matching overload of {{DFSClient.setQuota}} behaves differently (throws exception on null). Perhaps it is better to add a new {{ClientProtocol.setQuota}} RPC call. Either is fine though. # Do we need the new config key {{DFS_QUOTA_BY_STORAGETYPE_ENABLED_KEY}}? The administrator can already choose to avoid configuring per-type quotas, so I am not sure the new configuration is useful. # {{DistributedFileSystem.setQuotaByStorageType}} - the Javadoc _and one or more [Storage Type, space Quota] pairs_ does not match the signature. Looking at the NN changes next. Enable Quota Support for Storage Types (SSD) - Key: HDFS-7584 URL: https://issues.apache.org/jira/browse/HDFS-7584 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7584 Quota by Storage Type - 01202015.pdf, HDFS-7584.0.patch, HDFS-7584.1.patch, HDFS-7584.2.patch, HDFS-7584.3.patch, HDFS-7584.4.patch, editsStored Phase II of the heterogeneous storage feature was completed by HDFS-6584. This JIRA is opened to enable quota support for different storage types in terms of storage space usage. This is more important for certain storage types such as SSD, as it is precious and more performant.
As described in the design doc of HDFS-5682, we plan to add a new quotaByStorageType command and a new NameNode RPC protocol for it. The quota-by-storage-type feature applies at the HDFS directory level, similar to the traditional HDFS space quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
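The core bookkeeping of a per-storage-type quota can be modeled in a few lines. This is an illustrative stand-in for the directory-level accounting the patch adds, not the actual INode quota code; type names and method names are invented:

```java
import java.util.EnumMap;
import java.util.Map;

public class QuotaByStorageTypeSketch {
    enum StorageType { DISK, SSD, ARCHIVE }

    private final Map<StorageType, Long> quota = new EnumMap<>(StorageType.class);
    private final Map<StorageType, Long> used  = new EnumMap<>(StorageType.class);

    // Analogous in spirit to the quotaByStorageType command: set a byte
    // limit for one storage type on a directory.
    void setQuotaByStorageType(StorageType type, long bytes) {
        quota.put(type, bytes);
    }

    // Charge a write against the type-specific quota, if one is set.
    // Types with no quota configured remain unlimited.
    boolean tryConsume(StorageType type, long bytes) {
        long u = used.getOrDefault(type, 0L);
        Long q = quota.get(type);
        if (q != null && u + bytes > q) {
            return false;  // would exceed the SSD (or other type) quota
        }
        used.put(type, u + bytes);
        return true;
    }
}
```

The point of the per-type map is exactly the SSD case in the description: DISK writes stay unlimited while SSD consumption is capped.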
[jira] [Commented] (HDFS-7682) {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content
[ https://issues.apache.org/jira/browse/HDFS-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292763#comment-14292763 ] Jing Zhao commented on HDFS-7682: - Thanks for working on this, [~clamb]. One question about the current patch. The following code means we only do the length checking if the file is complete. Then for a file snapshotted while still being written, we will still have the issue. How about changing the condition to check whether the src is a snapshot path? Then we can use {{blockLocations.getFileLength}} + {{last block's length if it's incomplete}} as the length limit. {code} +if (blockLocations.isLastBlockComplete()) { + remaining = Math.min(length, blockLocations.getFileLength()); +} {code} {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content Key: HDFS-7682 URL: https://issues.apache.org/jira/browse/HDFS-7682 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-7682.000.patch DistributedFileSystem#getFileChecksum of a snapshotted file includes non-snapshotted content. This happens because DistributedFileSystem#getFileChecksum simply calculates the checksum of all of the CRCs from the blocks in the file. But, in the case of a snapshotted file, we don't want to include data in the checksum that was appended to the last block in the file after the snapshot was taken. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
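The suggested length limit boils down to a small computation. The helper below is hypothetical, with invented parameter names, shown only to pin down the arithmetic: the checksum should stop at the snapshot's recorded file length, plus the last block's length when that block is still incomplete:

```java
public class SnapshotChecksumLengthSketch {
    // Length limit for checksumming a snapshotted path, per the comment
    // above: blockLocations.getFileLength() plus the last block's length
    // if it's incomplete, capped by the requested length.
    static long checksumLimit(long requestedLength, long snapshotFileLength,
                              boolean lastBlockComplete, long lastBlockLength) {
        long limit = snapshotFileLength + (lastBlockComplete ? 0 : lastBlockLength);
        return Math.min(requestedLength, limit);
    }
}
```

With this limit applied on snapshot paths, bytes appended to the last block after the snapshot was taken never enter the checksum.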
[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292775#comment-14292775 ] Zhe Zhang commented on HDFS-7285: - I created a Google [doc | https://docs.google.com/document/d/12YbFDFQGJkx9aCPmtJVslgCIWC6ULnmJQxrIm-e8XFs/edit?usp=sharing] which is editable. If possible please log in so I know who's making each comment and update. [~szetszwo] The doc was last updated in mid-December and doesn't contain some of the latest updates (mainly from HDFS-7339). If you don't mind the wait I plan to finish updating it before Wednesday. You can also go ahead with your updates assuming the HDFS-7339 discussions were incorporated. Package naming is an interesting topic. *Erasure* doesn't sound very appropriate because it literally means the act of erasing something and is a bit ambiguous itself. Actually erasure coding is [a type of error correction codes | http://en.wikipedia.org/wiki/Forward_error_correction#List_of_error-correcting_codes] so we don't need to worry about the conflict with error correction. The only way to decrease ambiguity in general is to lengthen the abbreviation. Two potential candidates came to mind: *ecc* standing for error correction codes; or *erc* standing for erasure coding more specifically. Thoughts? Erasure Coding Support inside HDFS -- Key: HDFS-7285 URL: https://issues.apache.org/jira/browse/HDFS-7285 Project: Hadoop HDFS Issue Type: New Feature Reporter: Weihua Jiang Assignee: Zhe Zhang Attachments: ECAnalyzer.py, ECParser.py, HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, fsimage-analysis-20150105.pdf
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292887#comment-14292887 ] Tsz Wo Nicholas Sze commented on HDFS-7285: --- How about io.erasure_code? Intel ISA-L also uses erasure_code as the directory name and library name. Erasure Coding Support inside HDFS -- Key: HDFS-7285 URL: https://issues.apache.org/jira/browse/HDFS-7285 Project: Hadoop HDFS Issue Type: New Feature Reporter: Weihua Jiang Assignee: Zhe Zhang Attachments: ECAnalyzer.py, ECParser.py, HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, fsimage-analysis-20150105.pdf
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5796) The file system browser in the namenode UI requires SPNEGO.
[ https://issues.apache.org/jira/browse/HDFS-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292904#comment-14292904 ] Hadoop QA commented on HDFS-5796: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694648/HDFS-5796.3.patch against trunk revision 1f2b695. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-auth hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.ipc.TestRPCCallBenchmark Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9333//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9333//console This message is automatically generated. The file system browser in the namenode UI requires SPNEGO. --- Key: HDFS-5796 URL: https://issues.apache.org/jira/browse/HDFS-5796 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Arun Suresh Attachments: HDFS-5796.1.patch, HDFS-5796.1.patch, HDFS-5796.2.patch, HDFS-5796.3.patch After HDFS-5382, the browser makes webhdfs REST calls directly, requiring SPNEGO to work between user's browser and namenode. 
This won't work if the cluster's security infrastructure is isolated from the regular network. Moreover, SPNEGO is not supposed to be required for user-facing web pages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7018) Implement C interface for libhdfs3
[ https://issues.apache.org/jira/browse/HDFS-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhanwei Wang updated HDFS-7018: --- Attachment: (was: HDFS-7018-pnative.004.patch) Implement C interface for libhdfs3 -- Key: HDFS-7018 URL: https://issues.apache.org/jira/browse/HDFS-7018 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Zhanwei Wang Assignee: Zhanwei Wang Attachments: HDFS-7018-pnative.002.patch, HDFS-7018-pnative.003.patch, HDFS-7018.patch Implement C interface for libhdfs3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7018) Implement C interface for libhdfs3
[ https://issues.apache.org/jira/browse/HDFS-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhanwei Wang updated HDFS-7018: --- Attachment: HDFS-7018-pnative.004.patch Implement C interface for libhdfs3 -- Key: HDFS-7018 URL: https://issues.apache.org/jira/browse/HDFS-7018 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Zhanwei Wang Assignee: Zhanwei Wang Attachments: HDFS-7018-pnative.002.patch, HDFS-7018-pnative.003.patch, HDFS-7018-pnative.004.patch, HDFS-7018.patch Implement C interface for libhdfs3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3689) Add support for variable length block
[ https://issues.apache.org/jira/browse/HDFS-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-3689: Attachment: HDFS-3689.010.patch Thanks Nicholas! Updated the patch to address the comments. bq. The if-condition below should This if-condition has actually been covered by {{verifyQuota}}. Thus I only update the comment here. Add support for variable length block - Key: HDFS-3689 URL: https://issues.apache.org/jira/browse/HDFS-3689 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, hdfs-client, namenode Affects Versions: 3.0.0 Reporter: Suresh Srinivas Assignee: Jing Zhao Attachments: HDFS-3689.000.patch, HDFS-3689.001.patch, HDFS-3689.002.patch, HDFS-3689.003.patch, HDFS-3689.003.patch, HDFS-3689.004.patch, HDFS-3689.005.patch, HDFS-3689.006.patch, HDFS-3689.007.patch, HDFS-3689.008.patch, HDFS-3689.008.patch, HDFS-3689.009.patch, HDFS-3689.009.patch, HDFS-3689.010.patch, editsStored Currently HDFS supports fixed length blocks. Supporting variable length blocks will allow new use cases and features to be built on top of HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7353) Raw Erasure Coder API for concrete encoding and decoding
[ https://issues.apache.org/jira/browse/HDFS-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293021#comment-14293021 ] Hadoop QA commented on HDFS-7353: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694709/HDFS-7353-v5.patch against trunk revision 6f9fe76. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 3 warning messages. See https://builds.apache.org/job/PreCommit-HDFS-Build/9337//artifact/patchprocess/diffJavadocWarnings.txt for details. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9337//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9337//console This message is automatically generated. 
Raw Erasure Coder API for concrete encoding and decoding Key: HDFS-7353 URL: https://issues.apache.org/jira/browse/HDFS-7353 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Fix For: HDFS-EC Attachments: HDFS-7353-v1.patch, HDFS-7353-v2.patch, HDFS-7353-v3.patch, HDFS-7353-v4.patch, HDFS-7353-v5.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7353) Raw Erasure Coder API for concrete encoding and decoding
[ https://issues.apache.org/jira/browse/HDFS-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Zheng updated HDFS-7353: Attachment: HDFS-7353-v6.patch Hope this resolves the Javadoc warnings finally. Raw Erasure Coder API for concrete encoding and decoding Key: HDFS-7353 URL: https://issues.apache.org/jira/browse/HDFS-7353 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Fix For: HDFS-EC Attachments: HDFS-7353-v1.patch, HDFS-7353-v2.patch, HDFS-7353-v3.patch, HDFS-7353-v4.patch, HDFS-7353-v5.patch, HDFS-7353-v6.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7677) DistributedFileSystem#truncate should resolve symlinks
[ https://issues.apache.org/jira/browse/HDFS-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-7677: - Attachment: HDFS-7677.002.patch Thanks [~shv]! Updated the patch to address the comments. DistributedFileSystem#truncate should resolve symlinks -- Key: HDFS-7677 URL: https://issues.apache.org/jira/browse/HDFS-7677 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-7677.001.patch, HDFS-7677.002.patch We should resolve the symlinks in DistributedFileSystem#truncate as we do for {{create}}, {{open}}, {{append}} and so on; I don't see any reason not to support it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7353) Raw Erasure Coder API for concrete encoding and decoding
[ https://issues.apache.org/jira/browse/HDFS-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293116#comment-14293116 ] Hadoop QA commented on HDFS-7353: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694726/HDFS-7353-v6.patch against trunk revision 6f9fe76. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.security.token.delegation.web.TestWebDelegationToken Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9340//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9340//console This message is automatically generated. 
Raw Erasure Coder API for concrete encoding and decoding Key: HDFS-7353 URL: https://issues.apache.org/jira/browse/HDFS-7353 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Fix For: HDFS-EC Attachments: HDFS-7353-v1.patch, HDFS-7353-v2.patch, HDFS-7353-v3.patch, HDFS-7353-v4.patch, HDFS-7353-v5.patch, HDFS-7353-v6.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3107) HDFS truncate
[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291559#comment-14291559 ] Yi Liu commented on HDFS-3107: -- Thanks [~shv], I found another two things: * DistributedFileSystem#truncate should resolve symlinks * Expose truncate API via FileContext I opened two new JIRAs, HDFS-7677 and HADOOP-11510, and will fix them there. HDFS truncate - Key: HDFS-3107 URL: https://issues.apache.org/jira/browse/HDFS-3107 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Lei Chang Assignee: Plamen Jeliazkov Fix For: 2.7.0 Attachments: HDFS-3107-13.patch, HDFS-3107-14.patch, HDFS-3107-15.patch, HDFS-3107-HDFS-7056-combined.patch, HDFS-3107.008.patch, HDFS-3107.15_branch2.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf, editsStored, editsStored.xml Original Estimate: 1,344h Remaining Estimate: 1,344h Systems with transaction support often need to undo changes made to the underlying storage when a transaction is aborted. Currently HDFS does not support truncate (a standard POSIX operation), which is the reverse operation of append. This makes upper layer applications use ugly workarounds (such as keeping track of the discarded byte range per file in a separate metadata store, and periodically running a vacuum process to rewrite compacted files) to overcome this limitation of HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
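At the byte level, truncate is the inverse of append: keep the first newLength bytes and reject anything out of range. A toy model of the semantics follows; the real HDFS call operates on paths, not byte arrays, and may recover the last block asynchronously:

```java
import java.util.Arrays;

public class TruncateSketch {
    // Truncate as the reverse of append: keep only the first newLength
    // bytes of the file's contents.
    static byte[] truncate(byte[] file, long newLength) {
        if (newLength < 0 || newLength > file.length) {
            // Mirrors the usual contract: newLength must lie in [0, length].
            throw new IllegalArgumentException("newLength out of range: " + newLength);
        }
        return Arrays.copyOf(file, (int) newLength);
    }
}
```

This is exactly the primitive the transaction-abort use case needs: discard the byte range written by the aborted transaction instead of rewriting the whole file.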
[jira] [Commented] (HDFS-7353) Raw Erasure Coder API for concrete encoding and decoding
[ https://issues.apache.org/jira/browse/HDFS-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291614#comment-14291614 ] Hadoop QA commented on HDFS-7353: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694504/HDFS-7353-v4.patch against trunk revision 7b82c4a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 12 warning messages. See https://builds.apache.org/job/PreCommit-HDFS-Build/9327//artifact/patchprocess/diffJavadocWarnings.txt for details. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9327//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9327//console This message is automatically generated. 
Raw Erasure Coder API for concrete encoding and decoding Key: HDFS-7353 URL: https://issues.apache.org/jira/browse/HDFS-7353 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Fix For: HDFS-EC Attachments: HDFS-7353-v1.patch, HDFS-7353-v2.patch, HDFS-7353-v3.patch, HDFS-7353-v4.patch This is to abstract and define a raw erasure coder API across different code algorithms such as RS and XOR. Such an API can be implemented by utilizing various libraries, such as the Intel ISA library and the Jerasure library. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7353) Raw Erasure Coder API for concrete encoding and decoding
[ https://issues.apache.org/jira/browse/HDFS-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Zheng updated HDFS-7353: Attachment: HDFS-7353-v4.patch Updated the patch to resolve the Javadoc warnings. Raw Erasure Coder API for concrete encoding and decoding Key: HDFS-7353 URL: https://issues.apache.org/jira/browse/HDFS-7353 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Fix For: HDFS-EC Attachments: HDFS-7353-v1.patch, HDFS-7353-v2.patch, HDFS-7353-v3.patch, HDFS-7353-v4.patch This is to abstract and define a raw erasure coder API across different code algorithms such as RS and XOR. Such an API can be implemented by utilizing various libraries, such as the Intel ISA library and the Jerasure library. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7665) Add definition of truncate preconditions/postconditions to filesystem specification
[ https://issues.apache.org/jira/browse/HDFS-7665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291695#comment-14291695 ] Steve Loughran commented on HDFS-7665: -- The second of those, {{src/site/markdown/filesystem}}, is where I'd like it done; something which says this is what truncate() does, this is the valid state before and after, here are the exceptions you should/must throw, and here is the state afterwards. Once that's done, deriving a new subclass of {{AbstractFSContract}} purely to test truncate operations becomes straightforward: add a new contract test option to declare whether or not an FS supports truncate, and if it does, feed it valid data and expect a valid response, then feed it invalid corner-case data and expect failures. It's probably wise to mention the lack of guarantees about concurrent access, e.g. if an input stream is open when truncate() occurs, the outcome of all operations other than {{close()}} on that stream is undefined. Add definition of truncate preconditions/postconditions to filesystem specification --- Key: HDFS-7665 URL: https://issues.apache.org/jira/browse/HDFS-7665 Project: Hadoop HDFS Issue Type: Sub-task Components: documentation Affects Versions: 3.0.0 Reporter: Steve Loughran Fix For: 3.0.0 With the addition of a major new feature to filesystems, the filesystem specification in hadoop-common/site is now out of sync. This means that # there's no strict specification of what it should do # you can't derive tests from that specification # other people trying to implement the API will have to infer what to do from the HDFS source # there's no way to decide whether or not the HDFS implementation does what is intended # without matching tests against the raw local FS, differences between the HDFS impl and the POSIX-standard one won't be caught until it is potentially too late to fix. 
The operation should be relatively easy to define (after a truncate, the file's bytes [0...len-1] must equal the original bytes, length(file)==len, etc.). The truncate tests already written could then be pulled up into contract tests which any filesystem implementation can run against. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
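The postcondition stated above (bytes [0...len-1] preserved, length(file)==len) can be sketched as a contract-style check. The following is a local-filesystem analogy using java.nio's FileChannel.truncate, not the HDFS implementation; the class and method names are illustrative only:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Local-filesystem analogy of the truncate contract: after truncate(file, len),
// length(file) == len and bytes [0..len-1] equal the original bytes.
public class TruncateContract {
  public static boolean truncateKeepsPrefix(byte[] original, int len) {
    try {
      Path p = Files.createTempFile("truncate", ".bin");
      try {
        Files.write(p, original);
        try (FileChannel ch = FileChannel.open(p, StandardOpenOption.WRITE)) {
          ch.truncate(len);  // discard bytes at and beyond offset len
        }
        byte[] after = Files.readAllBytes(p);
        // The two postconditions from the spec discussion above.
        return after.length == len
            && Arrays.equals(after, Arrays.copyOfRange(original, 0, len));
      } finally {
        Files.deleteIfExists(p);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

A contract test in the sense described above would run this style of check against every filesystem that declares truncate support, plus the corner cases (len greater than the file length, negative len) where an exception is expected.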
[jira] [Commented] (HDFS-7678) Block group reader with decode functionality
[ https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291701#comment-14291701 ] Li Bo commented on HDFS-7678: - This code has not been synced with the Codec and Namenode code, so I use BlockGroupHelper and CodecHelper when I need to call codec and BlockGroup functions. There is still a lot of work to do, such as adding a BlockGroupContiguousReader which reads a block group in contiguous layout, and unit tests. Block group reader with decode functionality Key: HDFS-7678 URL: https://issues.apache.org/jira/browse/HDFS-7678 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Li Bo Assignee: Li Bo Attachments: BlockGroupReader.patch A block group reader will read data from a BlockGroup in either striping or contiguous layout. Corrupt blocks may be known before reading (reported by the namenode) or discovered during reading. The block group reader needs to do decoding work when corrupt blocks are found. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7678) Block group reader with decode functionality
Li Bo created HDFS-7678: --- Summary: Block group reader with decode functionality Key: HDFS-7678 URL: https://issues.apache.org/jira/browse/HDFS-7678 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Li Bo Assignee: Li Bo A block group reader will read data from a BlockGroup in either striping or contiguous layout. Corrupt blocks may be known before reading (reported by the namenode) or discovered during reading. The block group reader needs to do decoding work when corrupt blocks are found. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7678) Block group reader with decode functionality
[ https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Bo updated HDFS-7678: Attachment: BlockGroupReader.patch Block group reader with decode functionality Key: HDFS-7678 URL: https://issues.apache.org/jira/browse/HDFS-7678 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Li Bo Assignee: Li Bo Attachments: BlockGroupReader.patch A block group reader will read data from a BlockGroup in either striping or contiguous layout. Corrupt blocks may be known before reading (reported by the namenode) or discovered during reading. The block group reader needs to do decoding work when corrupt blocks are found. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7679) EC encode/decode framework
Li Bo created HDFS-7679: --- Summary: EC encode/decode framework Key: HDFS-7679 URL: https://issues.apache.org/jira/browse/HDFS-7679 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Li Bo EC tasks such as client striping writes, encoding a replicated file to EC form, transforming the block layout from contiguous to striping, and reconstructing corrupt blocks all have similar behavior. They all read data from the client/BlockGroup, encode/decode it, and write the results to some datanodes. So we can use a unified framework to handle all these tasks. We can use different BlockGroupReader, Coder (ECEncoder/ECDecoder) and BlockWriter implementations to handle the different EC tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7679) EC encode/decode framework
[ https://issues.apache.org/jira/browse/HDFS-7679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Bo updated HDFS-7679: Attachment: ECEncodeDecodeFramework.patch EC encode/decode framework -- Key: HDFS-7679 URL: https://issues.apache.org/jira/browse/HDFS-7679 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Li Bo Assignee: Li Bo Attachments: ECEncodeDecodeFramework.patch EC tasks such as client striping writes, encoding a replicated file to EC form, transforming the block layout from contiguous to striping, and reconstructing corrupt blocks all have similar behavior. They all read data from the client/BlockGroup, encode/decode it, and write the results to some datanodes. So we can use a unified framework to handle all these tasks. We can use different BlockGroupReader, Coder (ECEncoder/ECDecoder) and BlockWriter implementations to handle the different EC tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7679) EC encode/decode framework
[ https://issues.apache.org/jira/browse/HDFS-7679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Bo updated HDFS-7679: Assignee: Li Bo EC encode/decode framework -- Key: HDFS-7679 URL: https://issues.apache.org/jira/browse/HDFS-7679 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Li Bo Assignee: Li Bo EC tasks such as client striping writes, encoding a replicated file to EC form, transforming the block layout from contiguous to striping, and reconstructing corrupt blocks all have similar behavior. They all read data from the client/BlockGroup, encode/decode it, and write the results to some datanodes. So we can use a unified framework to handle all these tasks. We can use different BlockGroupReader, Coder (ECEncoder/ECDecoder) and BlockWriter implementations to handle the different EC tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7679) EC encode/decode framework
[ https://issues.apache.org/jira/browse/HDFS-7679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291712#comment-14291712 ] Li Bo commented on HDFS-7679: - The first patch shows the idea of the EC encode/decode framework. There is still a lot of work to do. EC encode/decode framework -- Key: HDFS-7679 URL: https://issues.apache.org/jira/browse/HDFS-7679 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Li Bo Assignee: Li Bo Attachments: ECEncodeDecodeFramework.patch EC tasks such as client striping writes, encoding a replicated file to EC form, transforming the block layout from contiguous to striping, and reconstructing corrupt blocks all have similar behavior. They all read data from the client/BlockGroup, encode/decode it, and write the results to some datanodes. So we can use a unified framework to handle all these tasks. We can use different BlockGroupReader, Coder (ECEncoder/ECDecoder) and BlockWriter implementations to handle the different EC tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7545) Data striping support in HDFS client
[ https://issues.apache.org/jira/browse/HDFS-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291736#comment-14291736 ] Li Bo commented on HDFS-7545: - Hi Zhe, thanks for your patch. I think writing blocks is more efficient your way. I have created two JIRAs and uploaded some code. We can have a discussion tomorrow morning. Data striping support in HDFS client Key: HDFS-7545 URL: https://issues.apache.org/jira/browse/HDFS-7545 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Li Bo Attachments: DataStripingSupportinHDFSClient.pdf, HDFS-7545-PoC.patch Data striping is a commonly used data layout with critical benefits in the context of erasure coding. This JIRA aims to extend HDFS client to work with striped blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7545) Data striping support in HDFS client
[ https://issues.apache.org/jira/browse/HDFS-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Bo updated HDFS-7545: Attachment: clientStriping.patch Data striping support in HDFS client Key: HDFS-7545 URL: https://issues.apache.org/jira/browse/HDFS-7545 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Li Bo Attachments: DataStripingSupportinHDFSClient.pdf, HDFS-7545-PoC.patch, clientStriping.patch Data striping is a commonly used data layout with critical benefits in the context of erasure coding. This JIRA aims to extend HDFS client to work with striped blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7545) Data striping support in HDFS client
[ https://issues.apache.org/jira/browse/HDFS-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291742#comment-14291742 ] Li Bo commented on HDFS-7545: - The patch is based on HDFS-7679 and HDFS-7678. Zhe's patch is also a good solution; we can have a discussion. There is still a lot to do on this patch. Data striping support in HDFS client Key: HDFS-7545 URL: https://issues.apache.org/jira/browse/HDFS-7545 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Li Bo Attachments: DataStripingSupportinHDFSClient.pdf, HDFS-7545-PoC.patch, clientStriping.patch Data striping is a commonly used data layout with critical benefits in the context of erasure coding. This JIRA aims to extend HDFS client to work with striped blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7545) Data striping support in HDFS client
[ https://issues.apache.org/jira/browse/HDFS-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291744#comment-14291744 ] Li Bo commented on HDFS-7545: - DFSStripeInputStream is based on DFSInputStream. Directly modifying DFSInputStream would add too much code, so I chose to create a new class. Data striping support in HDFS client Key: HDFS-7545 URL: https://issues.apache.org/jira/browse/HDFS-7545 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Li Bo Attachments: DataStripingSupportinHDFSClient.pdf, HDFS-7545-PoC.patch, clientStriping.patch Data striping is a commonly used data layout with critical benefits in the context of erasure coding. This JIRA aims to extend HDFS client to work with striped blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6673) Add Delimited format supports for PB OIV tool
[ https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292545#comment-14292545 ] Haohui Mai commented on HDFS-6673: -- bq. I think that the point that Andrew is trying to make is that this tool will run quickly on machines with more memory, while still being possible to use on machines with less memory. That's great. The concern that I have is that once the LevelDB is bigger than the working set, it requires one seek per inode. It will thrash the system at some size of fsimage (which heavily depends on the system that runs the oiv). Since many of the oiv tools Hadoop has today want to print out the full path, I would like to get the architecture right and make sure the issue is addressed. bq. Eventually we will probably want to drop support entirely, perhaps in Hadoop 3.0. There is a maintenance burden associated with maintaining two image formats. Agree. I retired the old format in HDFS-6158 and it was revived in HDFS-6293. The main requirements of oiv are: * The OIV can print out the full path for an inode * The OIV can run on commodity machines like a laptop even for the largest fsimage in production * The reads of the fsimage need to be disk-friendly, meaning that the number of seeks is minimized. There are two practical solutions that I can see so far: * Convert the fsimage into LevelDB before running the oiv * Tweak the saver of the pb-based fsimage so that it stores the inodes in full-path order. It can be done without changing the format of the current fsimage. Maybe we can explore these solutions? 
Add Delimited format supports for PB OIV tool - Key: HDFS-6673 URL: https://issues.apache.org/jira/browse/HDFS-6673 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Minor Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, HDFS-6673.005.patch, HDFS-6673.006.patch The new oiv tool, which is designed for the Protobuf fsimage, lacks a few features supported in the old {{oiv}} tool. This task adds support for the _Delimited_ processor to the oiv tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
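The full-path concern raised above (one lookup per ancestor inode when building a path) can be illustrated with a toy in-memory sketch; the names and the flat inode table are hypothetical, not the actual protobuf fsimage layout:

```java
import java.util.Map;

// Toy illustration of why printing full paths is expensive: each inode stores
// only its own name and parent id, so building a path walks the parent chain.
// When the inode table does not fit in memory (e.g. it lives in LevelDB),
// each step of this walk can become a random seek on disk.
public class PathResolver {
  static final long ROOT = 0;

  // Hypothetical minimal inode: parent pointer plus local name.
  static final class Inode {
    final long parent;
    final String name;
    Inode(long parent, String name) { this.parent = parent; this.name = name; }
  }

  public static String fullPath(Map<Long, Inode> inodes, long id) {
    StringBuilder sb = new StringBuilder();
    while (id != ROOT) {
      Inode n = inodes.get(id);  // one lookup (potentially one seek) per ancestor
      sb.insert(0, "/" + n.name);
      id = n.parent;
    }
    return sb.length() == 0 ? "/" : sb.toString();
  }
}
```

Storing inodes in full-path order, as proposed in the second solution above, would turn these scattered lookups into mostly sequential reads.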
[jira] [Commented] (HDFS-42) NetUtils.createSocketAddr NPEs if dfs.datanode.ipc.address is not set for a data node
[ https://issues.apache.org/jira/browse/HDFS-42?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292557#comment-14292557 ] Steve Loughran commented on HDFS-42: HADOOP-5687 duplicated this; it contained the patch NetUtils.createSocketAddr NPEs if dfs.datanode.ipc.address is not set for a data node - Key: HDFS-42 URL: https://issues.apache.org/jira/browse/HDFS-42 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.1, 0.20.2, 0.21.0, 0.22.0 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Labels: newbie Fix For: 0.21.0 DataNode.startDatanode assumes that a configuration always returns a non-null dfs.datanode.ipc.address value, as the result is passed straight down to NetUtils.createSocketAddr InetSocketAddress ipcAddr = NetUtils.createSocketAddr( conf.get("dfs.datanode.ipc.address")); which triggers an NPE Caused by: java.lang.NullPointerException at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:130) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:119) at org.apache.hadoop.dfs.DataNode.startDataNode(DataNode.java:353) at org.apache.hadoop.dfs.DataNode.&lt;init&gt;(DataNode.java:185) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-42) NetUtils.createSocketAddr NPEs if dfs.datanode.ipc.address is not set for a data node
[ https://issues.apache.org/jira/browse/HDFS-42?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HDFS-42. Resolution: Duplicate Fix Version/s: 0.21.0 NetUtils.createSocketAddr NPEs if dfs.datanode.ipc.address is not set for a data node - Key: HDFS-42 URL: https://issues.apache.org/jira/browse/HDFS-42 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.1, 0.20.2, 0.21.0, 0.22.0 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Labels: newbie Fix For: 0.21.0 DataNode.startDatanode assumes that a configuration always returns a non-null dfs.datanode.ipc.address value, as the result is passed straight down to NetUtils.createSocketAddr InetSocketAddress ipcAddr = NetUtils.createSocketAddr( conf.get("dfs.datanode.ipc.address")); which triggers an NPE Caused by: java.lang.NullPointerException at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:130) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:119) at org.apache.hadoop.dfs.DataNode.startDataNode(DataNode.java:353) at org.apache.hadoop.dfs.DataNode.&lt;init&gt;(DataNode.java:185) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
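The NPE pattern described above (a null config value passed straight into an address parser) is usually avoided with a null check or a default. A minimal sketch, using a plain java.util.Properties object in place of Hadoop's Configuration; the key name is from the report, but the default address is purely illustrative:

```java
import java.util.Properties;

// Sketch of the defensive pattern: fail fast with a clear message, or fall
// back to a default, instead of letting a null value reach the address parser.
public class IpcAddressConfig {
  public static String ipcAddress(Properties conf) {
    String addr = conf.getProperty("dfs.datanode.ipc.address");
    if (addr == null) {
      // Either throw a descriptive error here ...
      // throw new IllegalArgumentException("dfs.datanode.ipc.address is not set");
      // ... or fall back to a documented default (illustrative value only):
      addr = "0.0.0.0:50020";
    }
    return addr;
  }
}
```

Either choice turns an opaque NullPointerException deep inside NetUtils into an actionable configuration error at the call site.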
[jira] [Resolved] (HDFS-135) TestEditLog assumes that FSNamesystem.getFSNamesystem().dir is non-null, even after the FSNameSystem is closed
[ https://issues.apache.org/jira/browse/HDFS-135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HDFS-135. - Resolution: Cannot Reproduce TestEditLog assumes that FSNamesystem.getFSNamesystem().dir is non-null, even after the FSNameSystem is closed -- Key: HDFS-135 URL: https://issues.apache.org/jira/browse/HDFS-135 Project: Hadoop HDFS Issue Type: Bug Reporter: Steve Loughran Assignee: Steve Loughran In my modified services, I'm setting {{FSNameSystem.dir}} to {{null}} on {{close()}}: {code} if(dir != null) { dir.close(); dir = null; } {code} This breaks TestEditLog {code} java.lang.NullPointerException at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:620) at org.apache.hadoop.hdfs.server.namenode.TestEditLog.testEditLog(TestEditLog.java:148) {code} There are two possible conclusions here. # Setting dir=null in {{FSNameSystem.close()}} is a regression and should be fixed # The test contains some assumptions that are not valid I will leave it to others to decide; I will try and fix the code whichever approach is chosen. Personally, I'd go for setting dir=null as it is cleaner, but there is clearly some risk of backwards-compatibility problems, at least in test code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7353) Raw Erasure Coder API for concrete encoding and decoding
[ https://issues.apache.org/jira/browse/HDFS-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292580#comment-14292580 ] Zhe Zhang commented on HDFS-7353: - Good work [~drankye]! Please find my comments below: # {{ECChunk}} looks good to me. It's helpful to handle both arrays and {{ByteBuffer}} # {{RawErasureCoder}}: #* Please make sure no line is longer than 80 chars #* {{dataSize}}, {{paritySize}}, and {{chunkSize}} apply to all descendants of this interface. Shouldn't they become member variables? #* {{dataSize}} and {{paritySize}} could be better named because _size_ could mean number of bytes too. How about {{numDataBlks}} and {{numParityBlks}}? #* What potential implementation of the interface could need {{release()}}? It would be better to mention it in the comment. # {{RawErasureEncoder}} and {{RawErasureDecoder}} #* Since {{ECChunk}} already wraps around byte arrays and {{ByteBuffer}}, do we still need 3 versions of {{encode}} and {{decode}}? # {{AbstractRawErasureCoder}} #* Shouldn't {{toBuffers}} and {{toArray}} be static methods in the {{ECChunk}} class? All of the above should be easy to address. +1 pending these updates. Raw Erasure Coder API for concrete encoding and decoding Key: HDFS-7353 URL: https://issues.apache.org/jira/browse/HDFS-7353 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Fix For: HDFS-EC Attachments: HDFS-7353-v1.patch, HDFS-7353-v2.patch, HDFS-7353-v3.patch, HDFS-7353-v4.patch This is to abstract and define a raw erasure coder API across different code algorithms such as RS and XOR. Such an API can be implemented by utilizing various libraries, such as the Intel ISA library and the Jerasure library. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
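As a concrete instance of the kind of raw coder the API review above is discussing, here is a simplified XOR coder: one parity chunk over numDataBlks data chunks, with any single erased data chunk recoverable. The class and method shapes are illustrative sketches, not the actual signatures in the HDFS-7353 patch:

```java
// Simplified XOR raw coder in the spirit of the API under review: the parity
// chunk is the byte-wise XOR of all data chunks, and a single missing data
// chunk is rebuilt by XOR-ing the parity with the surviving chunks.
public class XorRawCoder {
  // Encode: produce one parity chunk from equally sized data chunks.
  public static byte[] encode(byte[][] data) {
    byte[] parity = new byte[data[0].length];
    for (byte[] chunk : data)
      for (int i = 0; i < parity.length; i++)
        parity[i] ^= chunk[i];
    return parity;
  }

  // Decode: recover the chunk at index 'erased' from the parity and the
  // remaining data chunks (XOR tolerates exactly one erasure).
  public static byte[] decode(byte[][] data, byte[] parity, int erased) {
    byte[] out = parity.clone();
    for (int j = 0; j < data.length; j++)
      if (j != erased)
        for (int i = 0; i < out.length; i++)
          out[i] ^= data[j][i];
    return out;
  }
}
```

An RS coder fills the same interface shape but supports multiple parity chunks and multiple erasures, which is why the review pushes for clear numDataBlks/numParityBlks naming at the interface level.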
[jira] [Commented] (HDFS-7339) Allocating and persisting block groups in NameNode
[ https://issues.apache.org/jira/browse/HDFS-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292595#comment-14292595 ] Zhe Zhang commented on HDFS-7339: - bq. It is not true. For an EC-block, the block group object is stored in the BlocksMap with the first block ID. The BlockGroup.equal(..) is implemented in a way that it returns true for any Block with an ID belonging to the group. In the example, BlockGroup(0x330).equal(..) returns true for Blocks 0x330..0x338. Agreed. EC blocks indeed only incur 1 lookup. bq. For a non-EC block, it is stored as usual. Given any block ID, it represents either an EC-block or a normal block. So, lookup(0x331) returns either the non-EC Block(0x331) or the BlockGroup(0x330). Could you elaborate a little on the lookup process? I still don't see how we can identify *0x331* with 1 lookup. No matter which key we try first -- *0x331* or *0x330* -- there's always a chance to miss. Allocating and persisting block groups in NameNode -- Key: HDFS-7339 URL: https://issues.apache.org/jira/browse/HDFS-7339 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-7339-001.patch, HDFS-7339-002.patch, HDFS-7339-003.patch, HDFS-7339-004.patch, HDFS-7339-005.patch, HDFS-7339-006.patch, Meta-striping.jpg, NN-stripping.jpg All erasure codec operations center around the concept of _block group_; they are formed in initial encoding and looked up in recoveries and conversions. A lightweight class {{BlockGroup}} is created to record the original and parity blocks in a coding group, as well as a pointer to the codec schema (pluggable codec schemas will be supported in HDFS-7337). With the striping layout, the HDFS client needs to operate on all blocks in a {{BlockGroup}} concurrently. Therefore we propose to extend a file’s inode to switch between _contiguous_ and _striping_ modes, with the current mode recorded in a binary flag. 
An array of BlockGroups (or BlockGroup IDs) is added, which remains empty for “traditional” HDFS files with contiguous block layout. The NameNode creates and maintains {{BlockGroup}} instances through the new {{ECManager}} component; the attached figure has an illustration of the architecture. As a simple example, when a {_Striping+EC_} file is created and written to, the NameNode will serve requests from the client to allocate new {{BlockGroups}} and store them under the {{INodeFile}}. In the current phase, {{BlockGroups}} are allocated both in initial online encoding and in the conversion from replication to EC. {{ECManager}} also facilitates the lookup of {{BlockGroup}} information for block recovery work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
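The single-lookup question debated above can be sketched with a hypothetical ID layout in which a group's member blocks share their high bits (the 4-bit mask below is illustrative, not the actual HDFS block ID scheme). The sketch also shows why the ambiguity raised in the comment matters: a masked probe only resolves cleanly if EC and non-EC IDs come from disjoint ranges.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of one-lookup group resolution when member blocks share an ID prefix:
// block IDs 0x330..0x338 all mask down to group key 0x330, so a single map
// probe with the masked key finds the BlockGroup.
public class BlockGroupMap {
  // Illustrative layout: low 4 bits index the block within its group.
  static final long GROUP_MASK = ~0xFL;

  // String values stand in for Block / BlockGroup objects in this sketch.
  private final Map<Long, String> map = new HashMap<>();

  public void putGroup(long firstBlockId, String group) {
    map.put(firstBlockId & GROUP_MASK, group);
  }

  public void putBlock(long blockId, String block) {
    map.put(blockId, block);  // non-EC blocks are stored under their own ID
  }

  // One probe resolves an EC member -- but only if the caller already knows
  // the ID is an EC ID (e.g. EC and non-EC IDs are allocated from disjoint
  // ranges). Otherwise a masked probe can still miss or collide, which is
  // exactly the ambiguity raised in the comment above.
  public String lookupEcMember(long blockId) {
    return map.get(blockId & GROUP_MASK);
  }
}
```

Reserving a distinct ID range (or a sign bit) for EC blocks removes the ambiguity, because the lookup path can then be chosen from the ID itself before the single map probe.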