[jira] [Updated] (HDFS-8891) HDFS concat should keep srcs order
[ https://issues.apache.org/jira/browse/HDFS-8891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yong Zhang updated HDFS-8891: - Status: Patch Available (was: Open) HDFS concat should keep srcs order -- Key: HDFS-8891 URL: https://issues.apache.org/jira/browse/HDFS-8891 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yong Zhang Assignee: Yong Zhang Attachments: HDFS-8891.001.patch FSDirConcatOp.verifySrcFiles may change the src files order, but it should keep their order as input. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
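The ordering pitfall flagged in HDFS-8891 can be illustrated with a minimal, hypothetical sketch (these helper names are not from FSDirConcatOp; they only show why deduplicating srcs through an unordered set loses the caller's order, while an insertion-ordered set keeps it):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;

public class SrcOrderDemo {
    // Deduplicating srcs through an unordered HashSet may reorder them.
    static List<String> dedupUnordered(String[] srcs) {
        return new ArrayList<>(new HashSet<>(Arrays.asList(srcs)));
    }

    // A LinkedHashSet deduplicates while preserving the caller's order.
    static List<String> dedupOrdered(String[] srcs) {
        return new ArrayList<>(new LinkedHashSet<>(Arrays.asList(srcs)));
    }

    public static void main(String[] args) {
        String[] srcs = {"/a/f2", "/a/f1", "/a/f3", "/a/f1"};
        System.out.println(dedupOrdered(srcs)); // prints [/a/f2, /a/f1, /a/f3]
    }
}
```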
[jira] [Updated] (HDFS-7831) Fix the starting index and end condition of the loop in FileDiffList.findEarlierSnapshotBlocks()
[ https://issues.apache.org/jira/browse/HDFS-7831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated HDFS-7831: -- Labels: (was: 2.6.1-candidate) Removing the 2.6.1-candidate label as this is not applicable to 2.6. Fix the starting index and end condition of the loop in FileDiffList.findEarlierSnapshotBlocks() Key: HDFS-7831 URL: https://issues.apache.org/jira/browse/HDFS-7831 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Fix For: 2.7.0 Attachments: HDFS-7831-01.patch Currently the loop in {{FileDiffList.findEarlierSnapshotBlocks()}} starts from {{insertPoint + 1}}. It should start from {{insertPoint - 1}}. As noted in [Jing's comment|https://issues.apache.org/jira/browse/HDFS-7056?focusedCommentId=14333864&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14333864] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
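A sketch of the fix being described (hypothetical, simplified signature; the real method works on FileDiff objects): with a diff list sorted by snapshot id, {{Collections.binarySearch}} yields an insertion point, and the scan for an earlier snapshot's blocks must walk backwards starting at {{insertPoint - 1}}, since entries at or after the insertion point belong to the same or later snapshots:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EarlierSnapshotScan {
    // Hypothetical simplification: snapshotIds is sorted ascending; blocksById
    // holds the ids of diffs that carry a block list. Return the id of the
    // latest diff strictly before snapshotId that has blocks, or null.
    static Integer findEarlierBlocks(List<Integer> snapshotIds,
                                     Map<Integer, int[]> blocksById,
                                     int snapshotId) {
        int insertPoint = Collections.binarySearch(snapshotIds, snapshotId);
        if (insertPoint < 0) {
            insertPoint = -(insertPoint + 1); // convert to insertion index
        }
        // Walk backwards from insertPoint - 1: entries at or after the
        // insertion point belong to the same or later snapshots.
        for (int i = insertPoint - 1; i >= 0; i--) {
            if (blocksById.containsKey(snapshotIds.get(i))) {
                return snapshotIds.get(i);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<Integer, int[]> blocks = new HashMap<>();
        blocks.put(3, new int[] {101, 102});
        System.out.println(findEarlierBlocks(Arrays.asList(1, 3, 5), blocks, 5)); // prints 3
    }
}
```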
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694855#comment-14694855 ] Yongjun Zhang commented on HDFS-8828: - Hi [~yufeigu], Thanks for the new rev 006 which tries to address the issue we discussed (to avoid re-copying an already copied dir/file which is moved to a newly created dir since the last snapshot/distcp). I have some more comments:
# Change {{Number of path in the copy list}} to {{Number of paths in the copy list}}
# Change
{code}
if (LOG.isDebugEnabled()) {
  LOG.debug("Path in the copy list: " + lastFileStatus.getPath().toUri().getPath());
}
{code}
to (add an idx and only print in usediff debug mode):
{code}
if (options.shouldUseDiff() && LOG.isDebugEnabled()) {
  LOG.debug("Copy list entry " + idx + ": " + lastFileStatus.getPath().toUri().getPath());
}
++idx;
{code}
# Add some more explanation to the javadoc of {{static HashSet<String> getExcludeList(Path dir, DiffInfo[] renameDiffs, Path prefix)}}, such as:
{code}
Given a newly created directory newDir in the snapshot diff, if a previously copied file/directory itemX is moved (renamed) to below newDir, itemX should be excluded so it will not be copied again.
{code}
# The goal of this jira is to only copy modified/created files, all of which would have entries in the snapshot diff report, so why do we have to call {{traverseDirectory}} to recursively traverse everything in {{doBuildListingWithSnapshotDiff(..)}} in this mode? Sounds to me that we only need to look at each snapshot diff item and its direct children. (mv ./x/y ./p/q would make two entries in the snapshot diff: ./x and ./p, so we do need to care about the first-level children of a snapshot diff entry.) Right? If so, to reuse the code in {{traverseDirectory}}, we can modify {{traverseDirectory}} to support a mode that only cares about the current source and its first-level children, but not recursively. Thanks.
Utilize Snapshot diff report to build copy list in distcp - Key: HDFS-8828 URL: https://issues.apache.org/jira/browse/HDFS-8828 Project: Hadoop HDFS Issue Type: Improvement Components: distcp, snapshots Reporter: Yufei Gu Assignee: Yufei Gu Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, HDFS-8828.006.patch Some users reported a huge time cost to build the file copy list in distcp (30 hours for 1.6M files). We can leverage the snapshot diff report to build a file copy list including only the files/dirs which changed between two snapshots (or a snapshot and a normal dir). It speeds up the process in two ways: 1. less copy list building time. 2. fewer file copy MR jobs. The HDFS snapshot diff report provides information about file/directory creation, deletion, rename and modification between two snapshots or between a snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to the default distcp, so it still relies on default distcp to build the complete list of files under the source dir. This patch only puts created and modified files into the copy list based on the snapshot diff report, so we can minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
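The exclude-list idea discussed in the review comment above can be sketched as follows (hypothetical, simplified types: the real helper takes Path and DiffInfo[]; here renames are a plain source-to-target map):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ExcludeListSketch {
    // Given a newly created directory newDir in the snapshot diff, any rename
    // whose target landed under newDir points at an already-copied item that
    // must not be copied again when newDir itself is copied.
    static Set<String> getExcludeList(String newDir, Map<String, String> renames) {
        Set<String> exclude = new HashSet<>();
        for (Map.Entry<String, String> r : renames.entrySet()) {
            // r maps a rename's source path to its target path
            if (r.getValue().startsWith(newDir + "/")) {
                exclude.add(r.getValue());
            }
        }
        return exclude;
    }

    public static void main(String[] args) {
        Map<String, String> renames = new HashMap<>();
        renames.put("/src/itemX", "/newdir/itemX"); // moved under the new dir
        renames.put("/src/a", "/other/a");          // unrelated rename
        System.out.println(getExcludeList("/newdir", renames)); // prints [/newdir/itemX]
    }
}
```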
[jira] [Updated] (HDFS-7610) Fix removal of dynamically added DN volumes
[ https://issues.apache.org/jira/browse/HDFS-7610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated HDFS-7610: -- Labels: 2.6.1-candidate (was: ) Fix removal of dynamically added DN volumes --- Key: HDFS-7610 URL: https://issues.apache.org/jira/browse/HDFS-7610 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.6.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Labels: 2.6.1-candidate Fix For: 2.7.0 Attachments: HDFS-7610.000.patch, HDFS-7610.001.patch In the hot swap feature, {{FsDatasetImpl#addVolume}} uses the base volume dir (e.g. {{/foo/data0}}) instead of the volume's current dir ({{/foo/data0/current}}) to construct {{FsVolumeImpl}}. As a result, the DataNode can not remove this newly added volume, because its {{FsVolumeImpl#getBasePath}} returns {{/foo}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
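A minimal sketch of why the constructor argument matters (hypothetical helper, a stand-in for the {{FsVolumeImpl#getBasePath}} logic): if the base path is derived as the parent of the directory handed to the constructor, passing the volume dir instead of its current dir shifts the base up one level:

```java
import java.io.File;

public class BasePathDemo {
    // Simplified: treat the base path as the parent of the directory
    // handed to the volume constructor.
    static String basePathOf(String currentDir) {
        return new File(currentDir).getParent();
    }

    public static void main(String[] args) {
        // Correct: construct from the volume's current dir.
        System.out.println(basePathOf("/foo/data0/current")); // prints /foo/data0
        // Bug described above: construct from the volume dir itself;
        // the base shifts up to /foo, so removal can't match the volume.
        System.out.println(basePathOf("/foo/data0")); // prints /foo
    }
}
```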
[jira] [Commented] (HDFS-8859) Improve DataNode (ReplicaMap) memory footprint to save about 45%
[ https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694879#comment-14694879 ] Yi Liu commented on HDFS-8859: -- Thanks [~szetszwo] for the review! I updated the patch to address your comments. {quote} How about calling it LightWeightResizableGSet? {quote} Agree, renamed it in the new patch. {quote} From your calculation, the patch improves each block replica object size by about 45%. The JIRA summary is misleading. It seems to claim that it improves the overall DataNode memory footprint by about 45%. For 10m replicas, the original overall map entry object size is ~900 MB and the new size is ~500MB. Is it correct? {quote} It's correct. Actually I added {{ReplicaMap}} in the JIRA summary (yes, I used {{()}}, :) ), considering the {{ReplicaMap}} is the major long-lived in-memory object of the Datanode; of course, there are other aspects (most are transient: data read/write buffer, rpc buffer, etc.), I just highlighted the improvement. {quote} Subclass can call super.put(..) {quote} Updated in the new patch; I just added a new internal method. {quote} There is a rewrite for LightWeightGSet.remove(..) {quote} I reverted it in the new patch and kept the original one. The original implementation has duplicate logic; we can share the same logic for all the {{if...else..}} branches. {quote} I think we need some long running tests to make sure the correctness. See TestGSet.runMultipleTestGSet() {quote} Agree, updated it in the new patch. For the test failures of {{003}}, it's because there is one place (BlockPoolSlice) that adds a replicaInfo to the replicaMap from a tmp replicaMap while the replicaInfo is still in the tmp one; we can remove it from the tmp one before adding (for LightWeightGSet, an element is not allowed to exist in two gsets). 
In the {{002}} patch, the failure doesn't exist; we have a new implementation of {{SetIterator}} which is very similar to the logic in java HashMap, and a bit different from the original one, but both are correct; the major difference is the time of finding the next element. In the new patch, I keep the original one, and make a few changes in BlockPoolSlice. All tests run successfully locally for the new patch. Improve DataNode (ReplicaMap) memory footprint to save about 45% Key: HDFS-8859 URL: https://issues.apache.org/jira/browse/HDFS-8859 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Yi Liu Assignee: Yi Liu Priority: Critical Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, HDFS-8859.003.patch By using the following approach we can save about *45%* memory footprint for each block replica in DataNode memory (this JIRA only talks about the *ReplicaMap* in the DataNode), the details are: In ReplicaMap,
{code}
private final Map<String, Map<Long, ReplicaInfo>> map =
    new HashMap<String, Map<Long, ReplicaInfo>>();
{code}
Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas in memory. The key is the block id of the block replica, which is already included in {{ReplicaInfo}}, so this memory can be saved. Also a HashMap Entry has an object overhead. We can implement a lightweight Set which is similar to {{LightWeightGSet}}, but not of a fixed size ({{LightWeightGSet}} uses a fixed size for the entries array, usually a big value; an example is {{BlocksMap}}, and this can avoid full gc since there is no need to resize), and we should also be able to get an element by key. 
Following is a comparison of the memory footprint if we implement a lightweight set as described. We can save:
{noformat}
SIZE (bytes)   ITEM
20             The Key: Long (12 bytes object overhead + 8 bytes long)
12             HashMap Entry object overhead
4              reference to the key in Entry
4              reference to the value in Entry
4              hash in Entry
{noformat}
Total: -44 bytes We need to add:
{noformat}
SIZE (bytes)   ITEM
4              a reference to the next element in ReplicaInfo
{noformat}
Total: +4 bytes So in total we can save 40 bytes for each block replica. And currently one finalized replica needs around 46 bytes (notice: we ignore memory alignment here). We can save 1 - (4 + 46) / (44 + 46) = *45%* memory for each block replica in the DataNode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
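The saving claimed above can be checked with straightforward arithmetic (all figures taken from the tables in the description; 44.4% rounds to the quoted ~45%):

```java
import java.util.Locale;

public class FootprintMath {
    public static void main(String[] args) {
        int saved = 20 + 12 + 4 + 4 + 4; // Long key + Entry overhead + 3 Entry fields = 44 bytes
        int added = 4;                   // next-element reference added to ReplicaInfo
        int replica = 46;                // one finalized replica, alignment ignored
        double fraction = 1.0 - (double) (added + replica) / (saved + replica);
        // Net saving is 40 bytes per replica, about 44-45% of the per-replica
        // map footprint; at 10m replicas that is roughly ~900 MB down to ~500 MB.
        System.out.printf(Locale.US, "save %d bytes (%.1f%%)%n", saved - added, fraction * 100);
    }
}
```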
[jira] [Updated] (HDFS-8859) Improve DataNode (ReplicaMap) memory footprint to save about 45%
[ https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8859: - Attachment: HDFS-8859.004.patch Improve DataNode (ReplicaMap) memory footprint to save about 45% Key: HDFS-8859 URL: https://issues.apache.org/jira/browse/HDFS-8859 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Yi Liu Assignee: Yi Liu Priority: Critical Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, HDFS-8859.003.patch, HDFS-8859.004.patch By using the following approach we can save about *45%* memory footprint for each block replica in DataNode memory (this JIRA only talks about the *ReplicaMap* in the DataNode), the details are: In ReplicaMap,
{code}
private final Map<String, Map<Long, ReplicaInfo>> map =
    new HashMap<String, Map<Long, ReplicaInfo>>();
{code}
Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas in memory. The key is the block id of the block replica, which is already included in {{ReplicaInfo}}, so this memory can be saved. Also a HashMap Entry has an object overhead. We can implement a lightweight Set which is similar to {{LightWeightGSet}}, but not of a fixed size ({{LightWeightGSet}} uses a fixed size for the entries array, usually a big value; an example is {{BlocksMap}}, and this can avoid full gc since there is no need to resize), and we should also be able to get an element by key. Following is a comparison of the memory footprint if we implement a lightweight set as described. We can save:
{noformat}
SIZE (bytes)   ITEM
20             The Key: Long (12 bytes object overhead + 8 bytes long)
12             HashMap Entry object overhead
4              reference to the key in Entry
4              reference to the value in Entry
4              hash in Entry
{noformat}
Total: -44 bytes We need to add:
{noformat}
SIZE (bytes)   ITEM
4              a reference to the next element in ReplicaInfo
{noformat}
Total: +4 bytes So in total we can save 40 bytes for each block replica. And currently one finalized replica needs around 46 bytes (notice: we ignore memory alignment here). We can save 1 - (4 + 46) / (44 + 46) = *45%* memory for each block replica in the DataNode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8891) HDFS concat should keep srcs order
[ https://issues.apache.org/jira/browse/HDFS-8891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yong Zhang updated HDFS-8891: - Attachment: HDFS-8891.001.patch First patch, please review HDFS concat should keep srcs order -- Key: HDFS-8891 URL: https://issues.apache.org/jira/browse/HDFS-8891 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yong Zhang Assignee: Yong Zhang Attachments: HDFS-8891.001.patch FSDirConcatOp.verifySrcFiles may change the src files order, but it should keep their order as input. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7843) A truncated file is corrupted after rollback from a rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-7843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated HDFS-7843: -- Labels: (was: 2.6.1-candidate) Removing the 2.6.1-candidate label as this is not applicable to 2.6.0. A truncated file is corrupted after rollback from a rolling upgrade --- Key: HDFS-7843 URL: https://issues.apache.org/jira/browse/HDFS-7843 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Blocker Fix For: 2.7.0 Attachments: h7843_20150226.patch Here is a rolling upgrade truncate test from [~brandonli]. The basic test steps are (3-node cluster with HA):
1. upload a file to hdfs
2. start rollingupgrade; finish rollingupgrade for the namenode and one datanode.
3. truncate the file in hdfs to 1 byte
4. do rollback
5. download the file from hdfs, check the file size to be the original size
I see the file size in hdfs is correct but can't read it because the block is corrupted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
[ https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695077#comment-14695077 ] Hadoop QA commented on HDFS-8808: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 28s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 44s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 42s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 27s | The applied patch generated 1 new checkstyle issues (total was 574, now 574). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 22s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 32s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 3s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 177m 4s | Tests failed in hadoop-hdfs. 
| | | | 221m 22s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestDFSClientRetries | | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750240/HDFS-8808-03.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 40f8151 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11986/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11986/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11986/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11986/console | This message was automatically generated. dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby Key: HDFS-8808 URL: https://issues.apache.org/jira/browse/HDFS-8808 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.1 Reporter: Gautam Gopalakrishnan Assignee: Zhe Zhang Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch, HDFS-8808-02.patch, HDFS-8808-03.patch The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the speed with which the fsimage is copied between the namenodes during regular use. However, as a side effect, this also limits transfers when the {{-bootstrapStandby}} option is used. This option is often used during upgrades and could potentially slow down the entire workflow. The request here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth setting -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8892) ShortCircuitCache.CacheCleaner can add Slot.isInvalid() check too
Ravikumar created HDFS-8892: --- Summary: ShortCircuitCache.CacheCleaner can add Slot.isInvalid() check too Key: HDFS-8892 URL: https://issues.apache.org/jira/browse/HDFS-8892 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.7.1 Reporter: Ravikumar Priority: Minor Currently the CacheCleaner thread checks only for cache-expiry times. It would be nice if it handled an invalid slot too, in an extra pass of the evictable map:
{code}
for (ShortCircuitReplica replica : evictable.values()) {
  if (!replica.getSlot().isValid()) {
    purge(replica);
  }
}
// Existing code...
int numDemoted = demoteOldEvictableMmaped(curMs);
int numPurged = 0;
Long evictionTimeNs = Long.valueOf(0);
...
{code}
Apps like HBase can tweak the expiry/staleness/cache-size params in the DFS-Client, so that a ShortCircuitReplica will never be closed except when its Slot is declared invalid. I assume slot-invalidation will happen during block-invalidation/deletes (primarily triggered by compaction/shard-takeover etc.). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8854) Erasure coding: add ECPolicy to replace schema+cellSize in hadoop-hdfs
[ https://issues.apache.org/jira/browse/HDFS-8854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694760#comment-14694760 ] Zhe Zhang commented on HDFS-8854: - Jenkins was not stable and [reported|https://builds.apache.org/job/Hadoop-HDFS-7285-Merge/83/] many unrelated failures. Will wait for another run. Erasure coding: add ECPolicy to replace schema+cellSize in hadoop-hdfs -- Key: HDFS-8854 URL: https://issues.apache.org/jira/browse/HDFS-8854 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7285 Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8854-Consolidated-20150806.02.txt, HDFS-8854-HDFS-7285-merge.03.patch, HDFS-8854-HDFS-7285-merge.03.txt, HDFS-8854-HDFS-7285.00.patch, HDFS-8854-HDFS-7285.01.patch, HDFS-8854-HDFS-7285.02.patch, HDFS-8854-HDFS-7285.03.patch, HDFS-8854.00.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8854) Erasure coding: add ECPolicy to replace schema+cellSize in hadoop-hdfs
[ https://issues.apache.org/jira/browse/HDFS-8854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694758#comment-14694758 ] Zhe Zhang commented on HDFS-8854: - Will try, thanks! Erasure coding: add ECPolicy to replace schema+cellSize in hadoop-hdfs -- Key: HDFS-8854 URL: https://issues.apache.org/jira/browse/HDFS-8854 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7285 Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-8854-Consolidated-20150806.02.txt, HDFS-8854-HDFS-7285-merge.03.patch, HDFS-8854-HDFS-7285-merge.03.txt, HDFS-8854-HDFS-7285.00.patch, HDFS-8854-HDFS-7285.01.patch, HDFS-8854-HDFS-7285.02.patch, HDFS-8854-HDFS-7285.03.patch, HDFS-8854.00.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
[ https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8808: Attachment: HDFS-8808-03.patch Updating patch to fix whitespace issue (it was actually not caused by the patch, fixing anyway). {{TestReplaceDatanodeOnFailure}} is unrelated to the change and passes locally. dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby Key: HDFS-8808 URL: https://issues.apache.org/jira/browse/HDFS-8808 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.1 Reporter: Gautam Gopalakrishnan Assignee: Zhe Zhang Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch, HDFS-8808-02.patch, HDFS-8808-03.patch The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the speed with which the fsimage is copied between the namenodes during regular use. However, as a side effect, this also limits transfers when the {{-bootstrapStandby}} option is used. This option is often used during upgrades and could potentially slow down the entire workflow. The request here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth setting -- This message was sent by Atlassian JIRA (v6.3.4#6332)
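The throttling behaviour at issue in HDFS-8808 can be sketched as follows. Hadoop's image transfer uses a DataTransferThrottler; the class below is a simplified, hypothetical stand-in showing only the core idea: sleep just long enough to keep the observed rate at or below the configured limit, with a non-positive rate meaning "unthrottled", which is one way to exempt {{-bootstrapStandby}} transfers from {{dfs.image.transfer.bandwidthPerSec}}:

```java
public class ImageTransferThrottler {
    // Simplified stand-in for a bandwidth throttler: a rate of zero or less
    // means "unthrottled" -- the behaviour this issue requests for
    // -bootstrapStandby image transfers.
    private final long bytesPerSec;
    private final long startMs = System.currentTimeMillis();
    private long bytesSent = 0;

    ImageTransferThrottler(long bytesPerSec) {
        this.bytesPerSec = bytesPerSec;
    }

    void throttle(long numBytes) {
        if (bytesPerSec <= 0) {
            return; // unthrottled, e.g. during -bootstrapStandby
        }
        bytesSent += numBytes;
        // Time by which this many bytes should have taken at the target rate.
        long expectedMs = bytesSent * 1000 / bytesPerSec;
        long elapsedMs = System.currentTimeMillis() - startMs;
        if (elapsedMs < expectedMs) {
            try {
                Thread.sleep(expectedMs - elapsedMs);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
            }
        }
    }
}
```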
[jira] [Commented] (HDFS-8622) Implement GETCONTENTSUMMARY operation for WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695119#comment-14695119 ] Hudson commented on HDFS-8622: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #286 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/286/]) HDFS-8622. Implement GETCONTENTSUMMARY operation for WebImageViewer. Contributed by Jagadesh Kiran N. (aajisaka: rev 40f815131e822f5b7a8e6a6827f4b85b31220c43) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsImageViewer.md * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageHandler.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewerForContentSummary.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java Implement GETCONTENTSUMMARY operation for WebImageViewer Key: HDFS-8622 URL: https://issues.apache.org/jira/browse/HDFS-8622 Project: Hadoop HDFS Issue Type: New Feature Reporter: Jagadesh Kiran N Assignee: Jagadesh Kiran N Attachments: HDFS-8622-00.patch, HDFS-8622-01.patch, HDFS-8622-02.patch, HDFS-8622-03.patch, HDFS-8622-04.patch, HDFS-8622-05.patch, HDFS-8622-06.patch, HDFS-8622-07.patch, HDFS-8622-08.patch, HDFS-8622-09.patch, HDFS-8622-10.patch it would be better for administrators if {code} GETCONTENTSUMMARY {code} is supported. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart
[ https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695123#comment-14695123 ] Hudson commented on HDFS-8879: -- FAILURE: Integrated in Hadoop-Yarn-trunk #1016 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1016/]) HDFS-8879. Quota by storage type usage incorrectly initialized upon namenode restart. Contributed by Xiaoyu Yao. (xyao: rev 3e715a4f4c46bcd8b3054cb0566e526c46bd5d66) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestQuotaByStorageType.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java Quota by storage type usage incorrectly initialized upon namenode restart - Key: HDFS-8879 URL: https://issues.apache.org/jira/browse/HDFS-8879 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Kihwal Lee Assignee: Xiaoyu Yao Fix For: 2.8.0 Attachments: HDFS-8879.01.patch This was found by [~kihwal] as part of HDFS-8865 work in this [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904]. The unit tests testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit failed to detect this because they were using an obsolete FSDirectory instance. Once the highlighted line below is added, the issue can be reproduced.
{code}
fsdir = cluster.getNamesystem().getFSDirectory();
INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString());
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8622) Implement GETCONTENTSUMMARY operation for WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695126#comment-14695126 ] Hudson commented on HDFS-8622: -- FAILURE: Integrated in Hadoop-Yarn-trunk #1016 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1016/]) HDFS-8622. Implement GETCONTENTSUMMARY operation for WebImageViewer. Contributed by Jagadesh Kiran N. (aajisaka: rev 40f815131e822f5b7a8e6a6827f4b85b31220c43) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsImageViewer.md * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageHandler.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewerForContentSummary.java Implement GETCONTENTSUMMARY operation for WebImageViewer Key: HDFS-8622 URL: https://issues.apache.org/jira/browse/HDFS-8622 Project: Hadoop HDFS Issue Type: New Feature Reporter: Jagadesh Kiran N Assignee: Jagadesh Kiran N Attachments: HDFS-8622-00.patch, HDFS-8622-01.patch, HDFS-8622-02.patch, HDFS-8622-03.patch, HDFS-8622-04.patch, HDFS-8622-05.patch, HDFS-8622-06.patch, HDFS-8622-07.patch, HDFS-8622-08.patch, HDFS-8622-09.patch, HDFS-8622-10.patch it would be better for administrators if {code} GETCONTENTSUMMARY {code} is supported. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%
[ https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695178#comment-14695178 ] Hadoop QA commented on HDFS-8859: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 19m 21s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. | | {color:green}+1{color} | javac | 7m 52s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 51s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 50s | The applied patch generated 6 new checkstyle issues (total was 12, now 16). | | {color:red}-1{color} | whitespace | 0m 1s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 29s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | common tests | 22m 33s | Tests failed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 76m 49s | Tests failed in hadoop-hdfs. 
| | | | 145m 35s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.ha.TestZKFailoverController | | | hadoop.net.TestNetUtils | | | hadoop.hdfs.TestReplication | | | hadoop.hdfs.TestSafeMode | | | hadoop.hdfs.TestDatanodeRegistration | | | hadoop.hdfs.tools.TestDebugAdmin | | | hadoop.hdfs.TestSetrepIncreasing | | | hadoop.hdfs.TestDatanodeReport | | | hadoop.hdfs.TestDFSShellGenericOptions | | | hadoop.hdfs.TestParallelRead | | | hadoop.hdfs.tools.TestStoragePolicyCommands | | | hadoop.hdfs.TestDFSRemove | | | hadoop.hdfs.qjournal.TestSecureNNWithQJM | | | hadoop.hdfs.web.TestWebHdfsTokens | | | hadoop.hdfs.TestHFlush | | | hadoop.hdfs.TestPersistBlocks | | | hadoop.hdfs.TestParallelShortCircuitReadNoChecksum | | | hadoop.hdfs.TestEncryptedTransfer | | | hadoop.hdfs.TestQuota | | | hadoop.hdfs.TestDFSClientFailover | | | hadoop.hdfs.shortcircuit.TestShortCircuitCache | | | hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerForAcl | | | hadoop.hdfs.tools.TestDFSAdmin | | | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead | | | hadoop.hdfs.web.TestWebHdfsFileSystemContract | | | hadoop.hdfs.web.TestWebHDFS | | | hadoop.hdfs.TestFileAppend | | | hadoop.hdfs.TestFileLengthOnClusterRestart | | | hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerForContentSummary | | | hadoop.hdfs.TestFSOutputSummer | | | hadoop.hdfs.TestEncryptionZonesWithHA | | | hadoop.hdfs.TestBlockReaderFactory | | | hadoop.hdfs.TestDFSFinalize | | | hadoop.hdfs.TestDisableConnCache | | | hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes | | | hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerForXAttr | | | hadoop.hdfs.web.TestHttpsFileSystem | | | hadoop.hdfs.web.TestWebHdfsWithAuthenticationFilter | | | hadoop.hdfs.web.TestWebHDFSAcl | | | hadoop.hdfs.TestHDFSTrash | | | hadoop.hdfs.TestDistributedFileSystem | | | hadoop.hdfs.TestDataTransferKeepalive | | | hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer | | | hadoop.hdfs.web.TestWebHDFSForHA | | | 
hadoop.hdfs.TestBlockMissingException | | | hadoop.hdfs.TestPipelines | | | hadoop.hdfs.TestRenameWhileOpen | | | hadoop.hdfs.TestFileCreationClient | | | hadoop.hdfs.TestEncryptionZones | | | hadoop.hdfs.TestFileAppend3 | | | hadoop.hdfs.TestBalancerBandwidth | | | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | | | hadoop.hdfs.TestSeekBug | | | hadoop.hdfs.TestParallelShortCircuitReadUnCached | | | hadoop.hdfs.TestBlockReaderLocal | | | hadoop.hdfs.TestListFilesInFileContext | | | hadoop.hdfs.web.TestWebHDFSXAttr | | | hadoop.hdfs.TestFileStatus | | | hadoop.hdfs.web.TestFSMainOperationsWebHdfs | | Timed out tests | org.apache.hadoop.hdfs.TestFileCreation | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750254/HDFS-8859.004.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 53bef9c | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11987/artifact/patchprocess/diffcheckstylehadoop-common.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11987/artifact/patchprocess/whitespace.txt | | hadoop-common test log |
[jira] [Assigned] (HDFS-8892) ShortCircuitCache.CacheCleaner can add Slot.isInvalid() check too
[ https://issues.apache.org/jira/browse/HDFS-8892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kanaka kumar avvaru reassigned HDFS-8892: - Assignee: kanaka kumar avvaru ShortCircuitCache.CacheCleaner can add Slot.isInvalid() check too - Key: HDFS-8892 URL: https://issues.apache.org/jira/browse/HDFS-8892 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.7.1 Reporter: Ravikumar Assignee: kanaka kumar avvaru Priority: Minor Currently the CacheCleaner thread checks only for cache-expiry times. It would be nice if it also handled invalid slots in an extra pass over the evictable map:
{code}
for (ShortCircuitReplica replica : evictable.values()) {
  if (!replica.getSlot().isValid()) {
    purge(replica);
  }
}
// Existing code...
int numDemoted = demoteOldEvictableMmaped(curMs);
int numPurged = 0;
Long evictionTimeNs = Long.valueOf(0);
….
…..
{code}
Apps like HBase can tweak the expiry/staleness/cache-size params in DFS-Client, so that a ShortCircuitReplica will never be closed except when its Slot is declared invalid. I assume slot-invalidation will happen during block-invalidation/deletes {Primarily triggered by compaction/shard-takeover etc..} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart
[ https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695116#comment-14695116 ] Hudson commented on HDFS-8879: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #286 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/286/]) HDFS-8879. Quota by storage type usage incorrectly initialized upon namenode restart. Contributed by Xiaoyu Yao. (xyao: rev 3e715a4f4c46bcd8b3054cb0566e526c46bd5d66) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestQuotaByStorageType.java Quota by storage type usage incorrectly initialized upon namenode restart - Key: HDFS-8879 URL: https://issues.apache.org/jira/browse/HDFS-8879 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Kihwal Lee Assignee: Xiaoyu Yao Fix For: 2.8.0 Attachments: HDFS-8879.01.patch This was found by [~kihwal] as part of HDFS-8865 work in this [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904]. The unit test testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit failed to detect this because they were using an obsolete FsDirectory instance. Once added the highlighted line below, the issue can be reproed. {code} fsdir = cluster.getNamesystem().getFSDirectory(); INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString()); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-8859) Improve DataNode (ReplicaMap) memory footprint to save about 45%
[ https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14694879#comment-14694879 ] Yi Liu edited comment on HDFS-8859 at 8/13/15 12:02 PM: Thanks [~szetszwo] for the review! I updated the patch to address your comments. {quote} How about calling it LightWeightResizableGSet? {quote} Agree, renamed it in the new patch. {quote} From your calculation, the patch improves each block replica object size by about 45%. The JIRA summary is misleading. It seems to claim that it improves the overall DataNode memory footprint by about 45%. For 10m replicas, the original overall map entry object size is ~900 MB and the new size is ~500 MB. Is that correct? {quote} It's correct. I did add {{ReplicaMap}} to the JIRA summary, yes, in {{()}}, :), considering that {{ReplicaMap}} is the major long-lived object in DataNode memory and can be large; of course, there are other aspects (many are transient: data read/write buffers, RPC buffers, etc.), I just highlighted the improvement. Let me remove the {{()}}. {quote} Subclass can call super.put(..) {quote} Updated in the new patch; I switched to a new internal method. {quote} There is a rewrite for LightWeightGSet.remove(..) {quote} I reverted it in the new patch and kept the original one. The original implementation had duplicate logic; we can share the same logic across all the {{if...else..}} branches. {quote} I think we need some long running tests to make sure the correctness. See TestGSet.runMultipleTestGSet() {quote} Agree, updated it in the new patch. The test failures of {{003}} happened because one place (BlockPoolSlice) adds a replicaInfo to replicaMap from a tmp replicaMap while the replicaInfo is still in the tmp one; we can remove it from the tmp map before adding (for LightWeightGSet, an element is not allowed to exist in two gsets). 
In the {{002}} patch the failure didn't exist; we had a new implementation of {{SetIterator}} which was very similar to the logic in the Java HashMap, and a bit different from the original one. Both are correct; the major difference is when the next element is found. In the new patch, I keep the original one and make a few changes in BlockPoolSlice. All tests run successfully locally for the new patch. was (Author: hitliuyi): Thanks [~szetszwo] for the review! I updated the patch to address your comments. {quote} How about calling it LightWeightResizableGSet? {quote} Agree, renamed it in the new patch. {quote} From your calculation, the patch improves each block replica object size by about 45%. The JIRA summary is misleading. It seems to claim that it improves the overall DataNode memory footprint by about 45%. For 10m replicas, the original overall map entry object size is ~900 MB and the new size is ~500 MB. Is that correct? {quote} It's correct. Actually I added {{ReplicaMap}} to the JIRA summary, yes, in {{()}}, :), considering that {{ReplicaMap}} is the major long-lived in-memory object of the DataNode; of course, there are other aspects (most are transient: data read/write buffers, RPC buffers, etc.), I just highlighted the improvement. {quote} Subclass can call super.put(..) {quote} Updated in the new patch; I switched to a new internal method. {quote} There is a rewrite for LightWeightGSet.remove(..) {quote} I reverted it in the new patch and kept the original one. The original implementation had duplicate logic; we can share the same logic across all the {{if...else..}} branches. {quote} I think we need some long running tests to make sure the correctness. See TestGSet.runMultipleTestGSet() {quote} Agree, updated it in the new patch. 
For the test failures of {{003}}: one place (BlockPoolSlice) adds a replicaInfo to replicaMap from a tmp replicaMap while the replicaInfo is still in the tmp one; we can remove it from the tmp map before adding (for LightWeightGSet, an element is not allowed to exist in two gsets). In the {{002}} patch the failure doesn't exist; we have a new implementation of {{SetIterator}} which is very similar to the logic in the Java HashMap, and a bit different from the original one, but both are correct; the major difference is when the next element is found. In the new patch, I keep the original one and make a few changes in BlockPoolSlice. All tests run successfully locally for the new patch. Improve DataNode (ReplicaMap) memory footprint to save about 45% Key: HDFS-8859 URL: https://issues.apache.org/jira/browse/HDFS-8859 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Yi Liu Assignee: Yi Liu Priority: Critical Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, HDFS-8859.003.patch, HDFS-8859.004.patch By using following approach we can save
[jira] [Updated] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%
[ https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-8859: - Summary: Improve DataNode ReplicaMap memory footprint to save about 45% (was: Improve DataNode (ReplicaMap) memory footprint to save about 45%) Improve DataNode ReplicaMap memory footprint to save about 45% -- Key: HDFS-8859 URL: https://issues.apache.org/jira/browse/HDFS-8859 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Yi Liu Assignee: Yi Liu Priority: Critical Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, HDFS-8859.003.patch, HDFS-8859.004.patch By using the following approach we can save about *45%* of the memory footprint for each block replica in DataNode memory (this JIRA only talks about the *ReplicaMap* in the DataNode). The details: In ReplicaMap, {code} private final Map<String, Map<Long, ReplicaInfo>> map = new HashMap<String, Map<Long, ReplicaInfo>>(); {code} Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas in memory. The key is the block id of the block replica, which is already included in {{ReplicaInfo}}, so this memory can be saved. Also, a HashMap Entry has an object overhead. We can implement a lightweight Set which is similar to {{LightWeightGSet}}, but not of fixed size ({{LightWeightGSet}} uses a fixed size for the entries array, usually a big value; an example is {{BlocksMap}}, which avoids full GC since there is no need to resize); we should also be able to get an element through its key. 
Following is a comparison of memory footprint if we implement a lightweight set as described. We can save: {noformat}
SIZE (bytes)  ITEM
20            The Key: Long (12 bytes object overhead + 8 bytes long)
12            HashMap Entry object overhead
4             reference to the key in Entry
4             reference to the value in Entry
4             hash in Entry
{noformat} Total: -44 bytes. We need to add: {noformat}
SIZE (bytes)  ITEM
4             a reference to the next element in ReplicaInfo
{noformat} Total: +4 bytes. So in total we can save 40 bytes for each block replica. Currently one finalized replica needs around 46 bytes (notice: we ignore memory alignment here). We can save 1 - (4 + 46) / (44 + 46) = *45%* memory for each block replica in DataNode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
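The byte arithmetic in the description above can be checked with a small standalone sketch (the byte counts are the ones quoted in the JIRA; the class name is illustrative only):

```java
public class SavingsCheck {
    // Byte counts quoted in the JIRA description (memory alignment ignored).
    static final int REMOVED = 20 + 12 + 4 + 4 + 4; // 44: Long key + Entry overhead + 2 refs + hash
    static final int ADDED = 4;                     // next-element reference added to ReplicaInfo
    static final int REPLICA = 46;                  // one finalized replica

    // Relative saving per map entry: 1 - (4 + 46) / (44 + 46)
    static double saving() {
        return 1.0 - (double) (ADDED + REPLICA) / (REMOVED + REPLICA);
    }

    public static void main(String[] args) {
        System.out.println("net bytes saved per replica: " + (REMOVED - ADDED)); // 40
        System.out.printf("relative saving: %.1f%%%n", saving() * 100);          // 44.4%, i.e. roughly the 45% quoted
    }
}
```

The exact ratio is 4/9 ≈ 44.4%, which the JIRA rounds to about 45%.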
[jira] [Created] (HDFS-8893) DNs with failed volumes stop serving during rolling upgrade
Rushabh S Shah created HDFS-8893: Summary: DNs with failed volumes stop serving during rolling upgrade Key: HDFS-8893 URL: https://issues.apache.org/jira/browse/HDFS-8893 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Rushabh S Shah Priority: Critical When a rolling upgrade starts, all DNs try to write a rolling_upgrade marker to each of their volumes. If one of the volumes is bad, this will fail. When this failure happens, the DN does not update the key it received from the NN. Unfortunately we had one failed volume on all 3 of the datanodes that held the replica. Keys expire after 20 hours, so at about 20 hours into the rolling upgrade, the DNs with failed volumes will stop serving clients. Here is the stack trace on the datanode side: {noformat} 2015-08-11 07:32:28,827 [DataNode: heartbeating to nn18020] WARN datanode.DataNode: IOException in offerService java.io.IOException: Read-only file system at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createNewFile(File.java:947) at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.setRollingUpgradeMarkers(BlockPoolSliceStorage.java:721) at org.apache.hadoop.hdfs.server.datanode.DataStorage.setRollingUpgradeMarker(DataStorage.java:173) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.setRollingUpgradeMarker(FsDatasetImpl.java:2357) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.signalRollingUpgrade(BPOfferService.java:480) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.handleRollingUpgradeStatus(BPServiceActor.java:626) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:677) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:833) at java.lang.Thread.run(Thread.java:722) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8893) DNs with failed volumes stop serving during rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-8893: - Assignee: Daryn Sharp DNs with failed volumes stop serving during rolling upgrade --- Key: HDFS-8893 URL: https://issues.apache.org/jira/browse/HDFS-8893 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Rushabh S Shah Assignee: Daryn Sharp Priority: Critical When a rolling upgrade starts, all DNs try to write a rolling_upgrade marker to each of their volumes. If one of the volumes is bad, this will fail. When this failure happens, the DN does not update the key it received from the NN. Unfortunately we had one failed volume on all 3 of the datanodes that held the replica. Keys expire after 20 hours, so at about 20 hours into the rolling upgrade, the DNs with failed volumes will stop serving clients. Here is the stack trace on the datanode side: {noformat} 2015-08-11 07:32:28,827 [DataNode: heartbeating to nn18020] WARN datanode.DataNode: IOException in offerService java.io.IOException: Read-only file system at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createNewFile(File.java:947) at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.setRollingUpgradeMarkers(BlockPoolSliceStorage.java:721) at org.apache.hadoop.hdfs.server.datanode.DataStorage.setRollingUpgradeMarker(DataStorage.java:173) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.setRollingUpgradeMarker(FsDatasetImpl.java:2357) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.signalRollingUpgrade(BPOfferService.java:480) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.handleRollingUpgradeStatus(BPServiceActor.java:626) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:677) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:833) at java.lang.Thread.run(Thread.java:722) {noformat} -- This 
message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8891) HDFS concat should keep srcs order
[ https://issues.apache.org/jira/browse/HDFS-8891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695320#comment-14695320 ] Hadoop QA commented on HDFS-8891: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 16s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 39s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 42s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 21s | There were no new checkstyle issues. | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 22s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 30s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 3s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 175m 19s | Tests failed in hadoop-hdfs. 
| | | | 219m 12s | | \\ \\ || Reason || Tests || | Timed out tests | org.apache.hadoop.cli.TestHDFSCLI | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750259/HDFS-8891.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 53bef9c | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11988/artifact/patchprocess/whitespace.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11988/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11988/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11988/console | This message was automatically generated. HDFS concat should keep srcs order -- Key: HDFS-8891 URL: https://issues.apache.org/jira/browse/HDFS-8891 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yong Zhang Assignee: Yong Zhang Attachments: HDFS-8891.001.patch FSDirConcatOp.verifySrcFiles may change the src files order, but it should keep their order as input. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8890) Allow admin to specify which blockpools the balancer should run on
[ https://issues.apache.org/jira/browse/HDFS-8890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695580#comment-14695580 ] Tsz Wo Nicholas Sze commented on HDFS-8890: --- We probably already have this feature since we can specify paths when running Balancer. Allow admin to specify which blockpools the balancer should run on -- Key: HDFS-8890 URL: https://issues.apache.org/jira/browse/HDFS-8890 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Reporter: Chris Trezzo Assignee: Chris Trezzo Currently the balancer runs on all blockpools. Allow an admin to run the balancer on a set of blockpools. This will enable the balancer to skip blockpools that should not be balanced. For example, a tmp blockpool that has a large amount of churn. An example of the command line interface would be an additional flag that specifies the blockpools by id: -blockpools BP-6299761-10.55.116.188-1415904647555,BP-47348528-10.51.120.139-1415904199257 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8865) Improve quota initialization performance
[ https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695574#comment-14695574 ] Xiaoyu Yao commented on HDFS-8865: -- Thanks for the patch, [~kihwal]! It looks pretty good to me. Just a few comments: 1. The number for the large namespace looks impressive. Do you have numbers for small/medium namespaces? 2. Is it possible to add some profiling info between the logs below so that we can easily find from the log how long quota initialization takes? {code} LOG.info("Initializing quota with " + threads + " thread(s)"); ... LOG.info("Quota initialization complete.\n" + counts); {code} 3. Can you change to parameterized logging to avoid parameter construction in case the log statement is disabled? For example, {code} LOG.debug("Setting quota for {}\n{}", dir, myCounts); {code} 4. NIT: typo chached - cached? {code} // Directly access the name system to obtain the current chached usage. {code} 5. Now that HDFS-8879 is in, can you rebase and update the patch? Thanks! Improve quota initialization performance Key: HDFS-8865 URL: https://issues.apache.org/jira/browse/HDFS-8865 Project: Hadoop HDFS Issue Type: Improvement Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, HDFS-8865.v2.patch After replaying edits, the whole file system tree is recursively scanned in order to initialize the quota. For a big name space, this can take a very long time. Since this is done during namenode failover, it also affects failover latency. By using the Fork-Join framework, I was able to greatly reduce the initialization time. The following is the test result using the fsimage from one of the big name nodes we have. || threads || seconds || | 1 (existing) | 55 | | 1 (fork-join) | 68 | | 4 | 16 | | 8 | 8 | | 12 | 6 | | 16 | 5 | | 20 | 4 | -- This message was sent by Atlassian JIRA (v6.3.4#6332)
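The Fork-Join approach described in HDFS-8865 (recursively scanning the tree, one subtask per subtree) can be sketched roughly as follows. This is a minimal illustration with stand-in types, not the actual patch code: `Node`, `CountTask`, and `totalUsage` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class QuotaInitSketch {
    // Minimal stand-in for a directory tree node (not the real INode).
    static class Node {
        long size;
        List<Node> children = new ArrayList<>();
    }

    // Recursively counts usage under a subtree, forking one subtask per child.
    static class CountTask extends RecursiveTask<Long> {
        private final Node dir;
        CountTask(Node dir) { this.dir = dir; }

        @Override
        protected Long compute() {
            long total = dir.size;
            List<CountTask> subtasks = new ArrayList<>();
            for (Node child : dir.children) {
                CountTask t = new CountTask(child);
                t.fork();          // schedule subtree scan on the pool
                subtasks.add(t);
            }
            for (CountTask t : subtasks) {
                total += t.join(); // accumulate child subtree counts
            }
            return total;
        }
    }

    static long totalUsage(Node root, int threads) {
        return new ForkJoinPool(threads).invoke(new CountTask(root));
    }

    public static void main(String[] args) {
        Node root = new Node();
        Node a = new Node(); a.size = 10;
        Node b = new Node(); b.size = 5;
        root.children.add(a);
        root.children.add(b);
        System.out.println(totalUsage(root, 4)); // prints 15
    }
}
```

The numbers in the table above are consistent with this shape: work splits across pool threads per subtree, so wall time drops roughly with thread count once the tree is wide enough.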
[jira] [Commented] (HDFS-8891) HDFS concat should keep srcs order
[ https://issues.apache.org/jira/browse/HDFS-8891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695634#comment-14695634 ] Jing Zhao commented on HDFS-8891: - Thanks for working on this, Yong! Agree we should keep the srcs order. For the fix, maybe we only need to replace the HashSet with a LinkedHashSet? HDFS concat should keep srcs order -- Key: HDFS-8891 URL: https://issues.apache.org/jira/browse/HDFS-8891 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yong Zhang Assignee: Yong Zhang Attachments: HDFS-8891.001.patch FSDirConcatOp.verifySrcFiles may change the src files order, but it should keep their order as input. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
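Jing's suggestion rests on a standard JDK guarantee: HashSet's iteration order is unspecified, while LinkedHashSet iterates in insertion order, which is exactly what concat's srcs validation needs. A small standalone sketch (the paths are made up, not from the patch):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class SrcOrderDemo {
    public static void main(String[] args) {
        List<String> srcs = Arrays.asList("/f3", "/f1", "/f2");
        // A LinkedHashSet still deduplicates like HashSet, but iterates
        // in insertion order, so the caller's srcs order is preserved.
        Set<String> ordered = new LinkedHashSet<>(srcs);
        System.out.println(new ArrayList<>(ordered).equals(srcs)); // prints true
        // A plain HashSet makes no such guarantee; iteration order may
        // differ from srcs, which is the bug this JIRA describes.
    }
}
```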
[jira] [Commented] (HDFS-8890) Allow admin to specify which blockpools the balancer should run on
[ https://issues.apache.org/jira/browse/HDFS-8890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695585#comment-14695585 ] Tsz Wo Nicholas Sze commented on HDFS-8890: --- Oops, my previous comment is incorrect. Mixing something up. Allow admin to specify which blockpools the balancer should run on -- Key: HDFS-8890 URL: https://issues.apache.org/jira/browse/HDFS-8890 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Reporter: Chris Trezzo Assignee: Chris Trezzo Currently the balancer runs on all blockpools. Allow an admin to run the balancer on a set of blockpools. This will enable the balancer to skip blockpools that should not be balanced. For example, a tmp blockpool that has a large amount of churn. An example of the command line interface would be an additional flag that specifies the blockpools by id: -blockpools BP-6299761-10.55.116.188-1415904647555,BP-47348528-10.51.120.139-1415904199257 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8894) Set SO_KEEPALIVE on DN server sockets
Nathan Roberts created HDFS-8894: Summary: Set SO_KEEPALIVE on DN server sockets Key: HDFS-8894 URL: https://issues.apache.org/jira/browse/HDFS-8894 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.7.1 Reporter: Nathan Roberts SO_KEEPALIVE is not set on things like datastreamer sockets which can cause lingering ESTABLISHED sockets when there is a network glitch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
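For reference, enabling SO_KEEPALIVE on a JDK socket is a one-line option call. A minimal sketch, not the DataNode code itself:

```java
import java.net.Socket;

public class KeepAliveSketch {
    public static void main(String[] args) throws Exception {
        Socket s = new Socket();
        // SO_KEEPALIVE makes the kernel probe idle connections, so a peer
        // that vanished during a network glitch is eventually detected and
        // the socket is torn down instead of lingering in ESTABLISHED.
        s.setKeepAlive(true);
        System.out.println(s.getKeepAlive()); // prints true
        s.close();
    }
}
```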
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695685#comment-14695685 ] Yongjun Zhang commented on HDFS-8828: - Hello [~yufeigu], I expect every new CREATE/MODIFICATION below the newly created dir would also have an entry in the snapshot diff report (maybe except the first-level children case described in my last comment), is this not the case? Thanks. Utilize Snapshot diff report to build copy list in distcp - Key: HDFS-8828 URL: https://issues.apache.org/jira/browse/HDFS-8828 Project: Hadoop HDFS Issue Type: Improvement Components: distcp, snapshots Reporter: Yufei Gu Assignee: Yufei Gu Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, HDFS-8828.006.patch Some users reported huge time cost to build the file copy list in distcp (30 hours for 1.6M files). We can leverage the snapshot diff report to build a file copy list including only files/dirs changed between two snapshots (or a snapshot and a normal dir). It speeds up the process in two ways: 1. less copy-list building time. 2. fewer file-copy MR jobs. The HDFS snapshot diff report provides information about file/directory creation, deletion, rename and modification between two snapshots or a snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to the default distcp, so it still relies on default distcp to build the complete list of files under the source dir. This patch only puts created and modified files into the copy list based on the snapshot diff report, so we can minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8622) Implement GETCONTENTSUMMARY operation for WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695521#comment-14695521 ] Hudson commented on HDFS-8622: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2232 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2232/]) HDFS-8622. Implement GETCONTENTSUMMARY operation for WebImageViewer. Contributed by Jagadesh Kiran N. (aajisaka: rev 40f815131e822f5b7a8e6a6827f4b85b31220c43) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewerForContentSummary.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageHandler.java * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsImageViewer.md * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java Implement GETCONTENTSUMMARY operation for WebImageViewer Key: HDFS-8622 URL: https://issues.apache.org/jira/browse/HDFS-8622 Project: Hadoop HDFS Issue Type: New Feature Reporter: Jagadesh Kiran N Assignee: Jagadesh Kiran N Attachments: HDFS-8622-00.patch, HDFS-8622-01.patch, HDFS-8622-02.patch, HDFS-8622-03.patch, HDFS-8622-04.patch, HDFS-8622-05.patch, HDFS-8622-06.patch, HDFS-8622-07.patch, HDFS-8622-08.patch, HDFS-8622-09.patch, HDFS-8622-10.patch it would be better for administrators if {code} GETCONTENTSUMMARY {code} were supported. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
[ https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695599#comment-14695599 ] Zhe Zhang commented on HDFS-8808: - Both reported test issues are unrelated and pass locally. The error message from Jenkins test result of {{testIdempotentAllocateBlockAndClose}} is interesting though. We should examine it in a separate JIRA. The checkstyle issue was pre-existing. dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby Key: HDFS-8808 URL: https://issues.apache.org/jira/browse/HDFS-8808 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.1 Reporter: Gautam Gopalakrishnan Assignee: Zhe Zhang Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch, HDFS-8808-02.patch, HDFS-8808-03.patch The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the speed with which the fsimage is copied between the namenodes during regular use. However, as a side effect, this also limits transfers when the {{-bootstrapStandby}} option is used. This option is often used during upgrades and could potentially slow down the entire workflow. The request here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth setting -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8854) Erasure coding: add ECPolicy to replace schema+cellSize in hadoop-hdfs
[ https://issues.apache.org/jira/browse/HDFS-8854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-8854: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: HDFS-7285 Status: Resolved (was: Patch Available) Jenkins still generating unrelated failures sometimes, but we have 1 successful [run | https://builds.apache.org/job/Hadoop-HDFS-7285-Merge/84/]. Committed to both HDFS-7285-merge and HDFS-7285. Thanks Walter for the contribution, and Rakesh for reviewing! Erasure coding: add ECPolicy to replace schema+cellSize in hadoop-hdfs -- Key: HDFS-8854 URL: https://issues.apache.org/jira/browse/HDFS-8854 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7285 Reporter: Walter Su Assignee: Walter Su Fix For: HDFS-7285 Attachments: HDFS-8854-Consolidated-20150806.02.txt, HDFS-8854-HDFS-7285-merge.03.patch, HDFS-8854-HDFS-7285-merge.03.txt, HDFS-8854-HDFS-7285.00.patch, HDFS-8854-HDFS-7285.01.patch, HDFS-8854-HDFS-7285.02.patch, HDFS-8854-HDFS-7285.03.patch, HDFS-8854.00.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart
[ https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695518#comment-14695518 ] Hudson commented on HDFS-8879: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2232 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2232/]) HDFS-8879. Quota by storage type usage incorrectly initialized upon namenode restart. Contributed by Xiaoyu Yao. (xyao: rev 3e715a4f4c46bcd8b3054cb0566e526c46bd5d66) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestQuotaByStorageType.java Quota by storage type usage incorrectly initialized upon namenode restart - Key: HDFS-8879 URL: https://issues.apache.org/jira/browse/HDFS-8879 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Kihwal Lee Assignee: Xiaoyu Yao Fix For: 2.8.0 Attachments: HDFS-8879.01.patch This was found by [~kihwal] as part of HDFS-8865 work in this [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904]. The unit test testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit failed to detect this because they were using an obsolete FsDirectory instance. Once added the highlighted line below, the issue can be reproed. {code} fsdir = cluster.getNamesystem().getFSDirectory(); INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString()); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%
[ https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8859: -- Priority: Major (was: Critical) This is a good change although it does not reduce the overall datanode memory footprint much. (For 10m blocks, it only reduces 400MB memory. However, a datanode does not even have 1m blocks in practice.) Improve DataNode ReplicaMap memory footprint to save about 45% -- Key: HDFS-8859 URL: https://issues.apache.org/jira/browse/HDFS-8859 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, HDFS-8859.003.patch, HDFS-8859.004.patch By using the following approach we can save about *45%* of the memory footprint for each block replica in DataNode memory (this JIRA only talks about the *ReplicaMap* in the DataNode). The details: In ReplicaMap, {code} private final Map<String, Map<Long, ReplicaInfo>> map = new HashMap<String, Map<Long, ReplicaInfo>>(); {code} Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas in memory. The key is the block id of the block replica, which is already included in {{ReplicaInfo}}, so this memory can be saved. Also, a HashMap Entry has an object overhead. We can implement a lightweight Set which is similar to {{LightWeightGSet}}, but not of fixed size ({{LightWeightGSet}} uses a fixed size for the entries array, usually a big value; an example is {{BlocksMap}}, which avoids full GC since there is no need to resize); we should also be able to get an element through its key. 
Following is a comparison of the memory footprint if we implement a lightweight set as described. We can save:
{noformat}
SIZE (bytes)   ITEM
20             The Key: Long (12 bytes object overhead + 8 bytes long)
12             HashMap Entry object overhead
4              reference to the key in Entry
4              reference to the value in Entry
4              hash in Entry
{noformat}
Total: -44 bytes. We need to add:
{noformat}
SIZE (bytes)   ITEM
4              a reference to the next element in ReplicaInfo
{noformat}
Total: +4 bytes. So in total we can save 40 bytes for each block replica. Currently one finalized replica needs around 46 bytes (note: we ignore memory alignment here), so we can save 1 - (4 + 46) / (44 + 46) = *45%* of the memory for each block replica in DataNode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
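The design above can be sketched in Java as follows. This is an illustrative, hypothetical simplification, not the actual Hadoop patch: the real {{LightWeightGSet}}, {{ReplicaInfo}}, and the resizable set added by HDFS-8859 differ in detail. The point it shows is the one argued above: each element embeds its own key (the block id) and a single next reference, so no per-entry wrapper object or boxed Long key is needed, and the bucket array can grow unlike {{LightWeightGSet}}'s fixed-size array.

```java
/** Illustrative sketch of a resizable, GSet-like hash set whose elements
 *  embed their own key and next pointer (no HashMap.Entry per element). */
public class LightWeightResizableSet {

  public interface Element {
    long getKey();             // e.g. the block id already stored in ReplicaInfo
    Element getNext();         // the single extra reference (+4 bytes) per element
    void setNext(Element next);
  }

  private Element[] buckets = new Element[16];  // length is always a power of two
  private int size;

  private int index(long key) {
    return (int) (key & (buckets.length - 1));
  }

  public Element get(long key) {
    for (Element e = buckets[index(key)]; e != null; e = e.getNext()) {
      if (e.getKey() == key) return e;
    }
    return null;
  }

  public void put(Element elem) {
    remove(elem.getKey());                      // replace any existing element
    if (size >= buckets.length * 3 / 4) resize();
    int i = index(elem.getKey());
    elem.setNext(buckets[i]);
    buckets[i] = elem;
    size++;
  }

  public Element remove(long key) {
    int i = index(key);
    Element prev = null;
    for (Element e = buckets[i]; e != null; prev = e, e = e.getNext()) {
      if (e.getKey() == key) {
        if (prev == null) buckets[i] = e.getNext();
        else prev.setNext(e.getNext());
        e.setNext(null);
        size--;
        return e;
      }
    }
    return null;
  }

  public int size() { return size; }

  private void resize() {                       // rehash into a doubled array
    Element[] old = buckets;
    buckets = new Element[old.length * 2];
    for (Element head : old) {
      while (head != null) {
        Element next = head.getNext();
        int i = index(head.getKey());
        head.setNext(buckets[i]);
        buckets[i] = head;
        head = next;
      }
    }
  }

  /** Minimal stand-in for a replica, used only for demonstration. */
  public static class FakeReplica implements Element {
    private final long blockId;
    private Element next;
    public FakeReplica(long blockId) { this.blockId = blockId; }
    public long getKey() { return blockId; }
    public Element getNext() { return next; }
    public void setNext(Element next) { this.next = next; }
  }
}
```

Resizing in 75%-load steps trades the fixed big array of {{LightWeightGSet}} (which avoids resize-triggered full GC) for a smaller steady-state footprint, which is the trade-off discussed in the description.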
[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nemanja Matkovic updated HDFS-8078: --- Description: 1st exception, on put: 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception java.lang.IllegalArgumentException: Does not contain a valid host:port authority: 2401:db00:1010:70ba:face:0:8:0:50010 at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) Appears to actually stem from code in DataNodeID which assumes it's safe to append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for IPv6. NetUtils.createSocketAddr( ) assembles a Java URI object, which requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 Currently using InetAddress.getByName() to validate IPv6 (guava InetAddresses.forString has been flaky) but could also use our own parsing. (From logging this, it seems like a low-enough frequency call that the extra object creation shouldn't be problematic, and for me the slight risk of passing in bad input that is not actually an IPv4 or IPv6 address and thus calling an external DNS lookup is outweighed by getting the address normalized and avoiding rewriting parsing.) 
Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() --- 2nd exception (on datanode) 15/04/13 13:18:07 ERROR datanode.DataNode: dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: /2401:db00:11:d010:face:0:2f:0:50010 java.io.EOFException at java.io.DataInputStream.readShort(DataInputStream.java:315) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) at java.lang.Thread.run(Thread.java:745) Which also comes as client error -get: 2401 is not an IP string literal. This one has existing parsing logic which needs to shift to the last colon rather than the first. Should also be a tiny bit faster by using lastIndexOf rather than split. Could alternatively use the techniques above. was: /patch1st exception, on put: 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception java.lang.IllegalArgumentException: Does not contain a valid host:port authority: 2401:db00:1010:70ba:face:0:8:0:50010 at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) Appears to actually stem from code in DataNodeID which assumes it's safe to append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for IPv6. 
NetUtils.createSocketAddr( ) assembles a Java URI object, which requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 Currently using InetAddress.getByName() to validate IPv6 (guava InetAddresses.forString has been flaky) but could also use our own parsing. (From logging this, it seems like a low-enough frequency call that the extra object creation shouldn't be problematic, and for me the slight risk of passing in bad input that is not actually an IPv4 or IPv6 address and thus calling an external DNS lookup is outweighed by getting the address normalized and avoiding rewriting parsing.) Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() --- 2nd exception (on datanode) 15/04/13 13:18:07 ERROR datanode.DataNode: dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: /2401:db00:11:d010:face:0:2f:0:50010 java.io.EOFException at java.io.DataInputStream.readShort(DataInputStream.java:315) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) at
[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to connect to IPv6 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nemanja Matkovic updated HDFS-8078: --- Description: /patch1st exception, on put: 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception java.lang.IllegalArgumentException: Does not contain a valid host:port authority: 2401:db00:1010:70ba:face:0:8:0:50010 at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) Appears to actually stem from code in DataNodeID which assumes it's safe to append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for IPv6. NetUtils.createSocketAddr( ) assembles a Java URI object, which requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 Currently using InetAddress.getByName() to validate IPv6 (guava InetAddresses.forString has been flaky) but could also use our own parsing. (From logging this, it seems like a low-enough frequency call that the extra object creation shouldn't be problematic, and for me the slight risk of passing in bad input that is not actually an IPv4 or IPv6 address and thus calling an external DNS lookup is outweighed by getting the address normalized and avoiding rewriting parsing.) 
Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() --- 2nd exception (on datanode) 15/04/13 13:18:07 ERROR datanode.DataNode: dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: /2401:db00:11:d010:face:0:2f:0:50010 java.io.EOFException at java.io.DataInputStream.readShort(DataInputStream.java:315) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226) at java.lang.Thread.run(Thread.java:745) Which also comes as client error -get: 2401 is not an IP string literal. This one has existing parsing logic which needs to shift to the last colon rather than the first. Should also be a tiny bit faster by using lastIndexOf rather than split. Could alternatively use the techniques above. was: 1st exception, on put: 15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception java.lang.IllegalArgumentException: Does not contain a valid host:port authority: 2401:db00:1010:70ba:face:0:8:0:50010 at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153) at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588) Appears to actually stem from code in DataNodeID which assumes it's safe to append together (ipaddr + : + port) -- which is OK for IPv4 and not OK for IPv6. 
NetUtils.createSocketAddr( ) assembles a Java URI object, which requires the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010 Currently using InetAddress.getByName() to validate IPv6 (guava InetAddresses.forString has been flaky) but could also use our own parsing. (From logging this, it seems like a low-enough frequency call that the extra object creation shouldn't be problematic, and for me the slight risk of passing in bad input that is not actually an IPv4 or IPv6 address and thus calling an external DNS lookup is outweighed by getting the address normalized and avoiding rewriting parsing.) Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress() --- 2nd exception (on datanode) 15/04/13 13:18:07 ERROR datanode.DataNode: dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation src: /2401:db00:20:7013:face:0:7:0:54152 dst: /2401:db00:11:d010:face:0:2f:0:50010 java.io.EOFException at java.io.DataInputStream.readShort(DataInputStream.java:315) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58) at
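The two parsing fixes described in HDFS-8078 can be sketched like this. These are hypothetical helper methods for illustration, not the actual NetUtils/DataNodeID code: splitting "addr:port" on the last colon rather than the first (and using lastIndexOf rather than split, as the description suggests), and bracketing raw IPv6 literals before handing them to a Java URI.

```java
/** Hypothetical helpers illustrating IPv6-safe host:port handling. */
public class HostPortUtil {

  /** Split "addr:port" on the LAST colon, so an IPv6 literal such as
   *  2401:db00:1010:70ba:face:0:8:0:50010 yields the full address and port. */
  public static String[] splitHostPort(String target) {
    int colon = target.lastIndexOf(':');
    if (colon < 0) {
      throw new IllegalArgumentException("No port in: " + target);
    }
    return new String[] { target.substring(0, colon), target.substring(colon + 1) };
  }

  /** Build a URI authority; a raw IPv6 literal must be bracketed
   *  (proto://[addr]:port) or java.net.URI rejects the authority. */
  public static String toUriAuthority(String host, int port) {
    boolean ipv6Literal = host.indexOf(':') >= 0 && !host.startsWith("[");
    return (ipv6Literal ? "[" + host + "]" : host) + ":" + port;
  }
}
```

Concatenating {{ipaddr + ":" + port}} happens to work for IPv4 only because the address contains no colon; the bracketed form is the URI-spec way to disambiguate the port from the address.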
[jira] [Updated] (HDFS-8861) Remove unnecessary log from method FSNamesystem.getCorruptFiles
[ https://issues.apache.org/jira/browse/HDFS-8861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobing Zhou updated HDFS-8861: Resolution: Won't Fix Status: Resolved (was: Patch Available) Closing this and leaving getCorruptFiles unchanged; that warn log is fine. However, the HDFS-8522 patch is still necessary. Remove unnecessary log from method FSNamesystem.getCorruptFiles --- Key: HDFS-8861 URL: https://issues.apache.org/jira/browse/HDFS-8861 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Priority: Minor Attachments: HDFS-8861.1.patch The log in FSNamesystem.getCorruptFiles prints out too many messages mixed with other log entries, which makes the whole log quite verbose and hard to understand and analyze, especially in cases where the SuperuserPrivilege check and Operation check are not satisfied in frequent calls of listCorruptFileBlocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696023#comment-14696023 ] Jing Zhao commented on HDFS-8828: - Sure. I will review the patch. Thanks for the work, Yufei and Yongjun! Utilize Snapshot diff report to build copy list in distcp - Key: HDFS-8828 URL: https://issues.apache.org/jira/browse/HDFS-8828 Project: Hadoop HDFS Issue Type: Improvement Components: distcp, snapshots Reporter: Yufei Gu Assignee: Yufei Gu Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, HDFS-8828.006.patch Some users reported a huge time cost to build the file copy list in distcp (30 hours for 1.6M files). We can leverage the snapshot diff report to build a file copy list including only files/dirs that changed between two snapshots (or between a snapshot and a normal dir). It speeds up the process in two ways: 1. less copy list building time. 2. fewer file copy MR jobs. The HDFS snapshot diff report provides information about file/directory creation, deletion, rename and modification between two snapshots or between a snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to the default distcp, so it still relies on the default distcp to build the complete list of files under the source dir. This patch only puts created and modified files into the copy list based on the snapshot diff report, so we can minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
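The core idea of HDFS-8828 above — after deletes and renames are synchronized on the target (HDFS-7535), only CREATE and MODIFY entries from the snapshot diff need to be copied — can be sketched as follows. The types and names here are illustrative stand-ins, not the actual distcp or SnapshotDiffReport classes:

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch: build a distcp copy list from a snapshot diff report. */
public class DiffCopyListBuilder {

  enum DiffType { CREATE, MODIFY, RENAME, DELETE }

  static final class DiffEntry {
    final DiffType type;
    final String path;
    DiffEntry(DiffType type, String path) { this.type = type; this.path = path; }
  }

  /** Deletes and renames are assumed already synced on the target, so only
   *  created and modified paths need to be copied; everything unchanged is
   *  skipped, which is what avoids the full recursive source traversal. */
  static List<String> buildCopyList(List<DiffEntry> diff) {
    List<String> copyList = new ArrayList<>();
    for (DiffEntry e : diff) {
      if (e.type == DiffType.CREATE || e.type == DiffType.MODIFY) {
        copyList.add(e.path);
      }
    }
    return copyList;
  }
}
```

The saving comes from the copy list being proportional to the size of the diff rather than to the size of the source tree.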
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696224#comment-14696224 ] Yongjun Zhang commented on HDFS-8828: - Thank you [~yufeigu] and [~jingzhao]! Utilize Snapshot diff report to build copy list in distcp - Key: HDFS-8828 URL: https://issues.apache.org/jira/browse/HDFS-8828 Project: Hadoop HDFS Issue Type: Improvement Components: distcp, snapshots Reporter: Yufei Gu Assignee: Yufei Gu Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, HDFS-8828.006.patch, HDFS-8828.007.patch Some users reported a huge time cost to build the file copy list in distcp (30 hours for 1.6M files). We can leverage the snapshot diff report to build a file copy list including only files/dirs that changed between two snapshots (or between a snapshot and a normal dir). It speeds up the process in two ways: 1. less copy list building time. 2. fewer file copy MR jobs. The HDFS snapshot diff report provides information about file/directory creation, deletion, rename and modification between two snapshots or between a snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to the default distcp, so it still relies on the default distcp to build the complete list of files under the source dir. This patch only puts created and modified files into the copy list based on the snapshot diff report, so we can minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8435) createNonRecursive support needed in WebHdfsFileSystem to support HBase
[ https://issues.apache.org/jira/browse/HDFS-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated HDFS-8435: -- Status: Open (was: Patch Available) createNonRecursive support needed in WebHdfsFileSystem to support HBase --- Key: HDFS-8435 URL: https://issues.apache.org/jira/browse/HDFS-8435 Project: Hadoop HDFS Issue Type: Improvement Components: webhdfs Affects Versions: 2.6.0 Reporter: Vinoth Sathappan Assignee: Jakob Homan Attachments: HDFS-8435-branch-2.7.001.patch, HDFS-8435.001.patch, HDFS-8435.002.patch The WebHdfsFileSystem implementation doesn't support createNonRecursive. HBase extensively depends on it for proper functioning. Currently, when the region servers are started over web hdfs, they crash with - createNonRecursive unsupported for this filesystem class org.apache.hadoop.hdfs.web.SWebHdfsFileSystem at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1137) at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1112) at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1088) at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.init(ProtobufLogWriter.java:85) at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:198) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8888) Support volumes in HDFS
[ https://issues.apache.org/jira/browse/HDFS-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696039#comment-14696039 ] Konstantin Shvachko commented on HDFS-8888: --- Could you please explain your concept of volumes? HDFS already has one from federation. I guess you are thinking of something different? Support volumes in HDFS --- Key: HDFS-8888 URL: https://issues.apache.org/jira/browse/HDFS-8888 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai There are multiple types of zones (e.g., snapshottable directories, encryption zones, directories with quotas) which are conceptually close to namespace volumes in traditional file systems. This jira proposes to introduce the concept of a volume to simplify the implementation of snapshots and encryption zones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6244) Make Trash Interval configurable for each of the namespaces
[ https://issues.apache.org/jira/browse/HDFS-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated HDFS-6244: -- Status: Patch Available (was: Open) Make Trash Interval configurable for each of the namespaces --- Key: HDFS-6244 URL: https://issues.apache.org/jira/browse/HDFS-6244 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.5-alpha Reporter: Siqi Li Assignee: Siqi Li Labels: BB2015-05-TBR Attachments: HDFS-6244.v1.patch, HDFS-6244.v2.patch, HDFS-6244.v3.patch, HDFS-6244.v4.patch Somehow we need to avoid the cluster filling up. One solution is to have a different trash policy per namespace. However, if we can simply make the property configurable per namespace, then the same config can be rolled everywhere and we'd be done. This seems simple enough. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696167#comment-14696167 ] Yufei Gu commented on HDFS-8828: Thank you, [~jingzhao]. Glad to have you review the code. Utilize Snapshot diff report to build copy list in distcp - Key: HDFS-8828 URL: https://issues.apache.org/jira/browse/HDFS-8828 Project: Hadoop HDFS Issue Type: Improvement Components: distcp, snapshots Reporter: Yufei Gu Assignee: Yufei Gu Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, HDFS-8828.006.patch, HDFS-8828.007.patch Some users reported a huge time cost to build the file copy list in distcp (30 hours for 1.6M files). We can leverage the snapshot diff report to build a file copy list including only files/dirs that changed between two snapshots (or between a snapshot and a normal dir). It speeds up the process in two ways: 1. less copy list building time. 2. fewer file copy MR jobs. The HDFS snapshot diff report provides information about file/directory creation, deletion, rename and modification between two snapshots or between a snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to the default distcp, so it still relies on the default distcp to build the complete list of files under the source dir. This patch only puts created and modified files into the copy list based on the snapshot diff report, so we can minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%
[ https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696262#comment-14696262 ] Yi Liu commented on HDFS-8859: -- It seems Jenkins has some problem and all runs time out. I randomly selected 10 of them and they ran successfully and quickly, so let me re-trigger Jenkins. Improve DataNode ReplicaMap memory footprint to save about 45% -- Key: HDFS-8859 URL: https://issues.apache.org/jira/browse/HDFS-8859 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, HDFS-8859.003.patch, HDFS-8859.004.patch By using the following approach we can save about *45%* of the memory footprint for each block replica in DataNode memory (this JIRA only talks about the *ReplicaMap* in DataNode). The details are: In ReplicaMap,
{code}
private final Map<String, Map<Long, ReplicaInfo>> map = new HashMap<String, Map<Long, ReplicaInfo>>();
{code}
Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas in memory. The key is the block id of the block replica, which is already included in {{ReplicaInfo}}, so this memory can be saved. Also, each HashMap Entry has an object overhead. We can implement a lightweight Set similar to {{LightWeightGSet}}, but without a fixed size ({{LightWeightGSet}} uses a fixed size for the entries array, usually a big value, for example in {{BlocksMap}}; this avoids full GC since there is no need to resize), and we should still be able to look up an element by key.
Following is a comparison of the memory footprint if we implement a lightweight set as described. We can save:
{noformat}
SIZE (bytes)   ITEM
20             The Key: Long (12 bytes object overhead + 8 bytes long)
12             HashMap Entry object overhead
4              reference to the key in Entry
4              reference to the value in Entry
4              hash in Entry
{noformat}
Total: -44 bytes. We need to add:
{noformat}
SIZE (bytes)   ITEM
4              a reference to the next element in ReplicaInfo
{noformat}
Total: +4 bytes. So in total we can save 40 bytes for each block replica. Currently one finalized replica needs around 46 bytes (note: we ignore memory alignment here), so we can save 1 - (4 + 46) / (44 + 46) = *45%* of the memory for each block replica in DataNode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8895) Remove deprecated BlockStorageLocation APIs
[ https://issues.apache.org/jira/browse/HDFS-8895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-8895: -- Attachment: HDFS-8895.001.patch Patch attached, deleting lots of the code. I looked at the original patch at HDFS-3672 for guidance as to what to delete; I would appreciate a second look to make sure I didn't miss anything. Remove deprecated BlockStorageLocation APIs --- Key: HDFS-8895 URL: https://issues.apache.org/jira/browse/HDFS-8895 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: HDFS-8895.001.patch HDFS-8887 supersedes DistributedFileSystem#getFileBlockStorageLocations, so it can be removed from trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated HDFS-8828: --- Attachment: HDFS-8828.007.patch Utilize Snapshot diff report to build copy list in distcp - Key: HDFS-8828 URL: https://issues.apache.org/jira/browse/HDFS-8828 Project: Hadoop HDFS Issue Type: Improvement Components: distcp, snapshots Reporter: Yufei Gu Assignee: Yufei Gu Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, HDFS-8828.006.patch, HDFS-8828.007.patch Some users reported a huge time cost to build the file copy list in distcp (30 hours for 1.6M files). We can leverage the snapshot diff report to build a file copy list including only files/dirs that changed between two snapshots (or between a snapshot and a normal dir). It speeds up the process in two ways: 1. less copy list building time. 2. fewer file copy MR jobs. The HDFS snapshot diff report provides information about file/directory creation, deletion, rename and modification between two snapshots or between a snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to the default distcp, so it still relies on the default distcp to build the complete list of files under the source dir. This patch only puts created and modified files into the copy list based on the snapshot diff report, so we can minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8895) Remove deprecated BlockStorageLocation APIs
[ https://issues.apache.org/jira/browse/HDFS-8895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-8895: -- Status: Patch Available (was: Open) Remove deprecated BlockStorageLocation APIs --- Key: HDFS-8895 URL: https://issues.apache.org/jira/browse/HDFS-8895 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: HDFS-8895.001.patch HDFS-8887 supersedes DistributedFileSystem#getFileBlockStorageLocations, so it can be removed from trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696164#comment-14696164 ] Yufei Gu commented on HDFS-8828: Hi [~yzhangal], Thanks very much for the code review. I've made the modifications and uploaded a new patch. Utilize Snapshot diff report to build copy list in distcp - Key: HDFS-8828 URL: https://issues.apache.org/jira/browse/HDFS-8828 Project: Hadoop HDFS Issue Type: Improvement Components: distcp, snapshots Reporter: Yufei Gu Assignee: Yufei Gu Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, HDFS-8828.006.patch, HDFS-8828.007.patch Some users reported a huge time cost to build the file copy list in distcp (30 hours for 1.6M files). We can leverage the snapshot diff report to build a file copy list including only files/dirs that changed between two snapshots (or between a snapshot and a normal dir). It speeds up the process in two ways: 1. less copy list building time. 2. fewer file copy MR jobs. The HDFS snapshot diff report provides information about file/directory creation, deletion, rename and modification between two snapshots or between a snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to the default distcp, so it still relies on the default distcp to build the complete list of files under the source dir. This patch only puts created and modified files into the copy list based on the snapshot diff report, so we can minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696000#comment-14696000 ] Yongjun Zhang commented on HDFS-8828: - Hi [~yufeigu], Thanks for answering my question in person. So for a newly created dir, there is indeed one CREATE entry in the snapshot diff report, and no entries for new elements created below this dir. So please take care of my comments 1 and 2 in my previous review, plus: 3. I suggest changing the {{getExcludeList}} method to {{getTraverseExcludeList}} (hopefully a better name), with the following javadoc as we agreed.
{code}
This method returns a list of items to be excluded when recursively traversing newDir to build the copy list. Specifically, given a newly created directory newDir (a CREATE entry in the snapshot diff), if a previously copied file/directory itemX is moved (a RENAME entry in the snapshot diff) into newDir, itemX should be excluded when recursively traversing newDir in #traverseDirectory, so that it will not be copied again. If the same itemX also has a MODIFY entry in the snapshot diff report, meaning it was modified after it was previously copied, it will still be added to the copy list (handled in the main loop of doBuildListingWithSnapshotDiff).
{code}
4. Do the refactoring to consolidate the duplicated test code that we discussed. Hi [~jingzhao], I had quite some side discussion with Yufei, and I am +1 on the change once the above comments are addressed. Would you please take a look at it if you wish? I'm targeting committing it next Monday. Thanks much.
Utilize Snapshot diff report to build copy list in distcp - Key: HDFS-8828 URL: https://issues.apache.org/jira/browse/HDFS-8828 Project: Hadoop HDFS Issue Type: Improvement Components: distcp, snapshots Reporter: Yufei Gu Assignee: Yufei Gu Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, HDFS-8828.006.patch Some users reported a huge time cost to build the file copy list in distcp (30 hours for 1.6M files). We can leverage the snapshot diff report to build a file copy list including only files/dirs that changed between two snapshots (or between a snapshot and a normal dir). It speeds up the process in two ways: 1. less copy list building time. 2. fewer file copy MR jobs. The HDFS snapshot diff report provides information about file/directory creation, deletion, rename and modification between two snapshots or between a snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to the default distcp, so it still relies on the default distcp to build the complete list of files under the source dir. This patch only puts created and modified files into the copy list based on the snapshot diff report, so we can minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
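The exclude-list behavior agreed in the review comments above can be sketched as follows. This is a hypothetical simplification for illustration: the real patch works on DiffInfo[] and Path objects, while here renames are modeled as a plain source-to-target map of path strings. The sketch collects rename targets that landed under the newly created directory, which are exactly the items that were already copied before the rename and so must be skipped when the directory is traversed.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch of computing the traverse-exclude list for a newly
 *  created directory (a CREATE entry) in the snapshot diff. */
public class TraverseExcludeList {

  /** Rename targets under newDir were already copied before the rename, so
   *  they are excluded from the recursive traversal of newDir. (Paths that
   *  also carry a MODIFY diff entry are still re-copied by the main diff
   *  loop, not by the traversal, matching the javadoc above.) */
  static Set<String> getTraverseExcludeList(String newDir, Map<String, String> renames) {
    String prefix = newDir.endsWith("/") ? newDir : newDir + "/";
    Set<String> exclude = new HashSet<>();
    for (String renameTarget : renames.values()) {
      if (renameTarget.startsWith(prefix)) {
        exclude.add(renameTarget);
      }
    }
    return exclude;
  }
}
```

A move like {{mv /x/y /newdir/y}} thus contributes {{/newdir/y}} to the exclude list, while a rename whose target lies elsewhere contributes nothing.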
[jira] [Commented] (HDFS-7649) Multihoming docs should emphasize using hostnames in configurations
[ https://issues.apache.org/jira/browse/HDFS-7649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696079#comment-14696079 ] Hudson commented on HDFS-7649: -- FAILURE: Integrated in Hadoop-trunk-Commit #8295 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8295/]) HDFS-7649. Multihoming docs should emphasize using hostnames in configurations. (Contributed by Brahma Reddy Battula) (arp: rev ae57d60d8239916312bca7149e2285b2ed3b123a) * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsMultihoming.md * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Multihoming docs should emphasize using hostnames in configurations --- Key: HDFS-7649 URL: https://issues.apache.org/jira/browse/HDFS-7649 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Reporter: Arpit Agarwal Assignee: Brahma Reddy Battula Fix For: 2.8.0 Attachments: HDFS-7649.patch The docs should emphasize that master and slave configurations should use hostnames wherever possible. Link to current docs: https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696103#comment-14696103 ] Ravi Prakash commented on HDFS-6407: The patch looks good to me. +1. new namenode UI, lost ability to sort columns in datanode tab - Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Haohui Mai Priority: Critical Labels: BB2015-05-TBR Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.008.patch, HDFS-6407.009.patch, HDFS-6407.010.patch, HDFS-6407.011.patch, HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.7.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting 2.png, sorting table.png old ui supported clicking on column header to sort on that column. The new ui seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanodes information, directory listings and snapshots. When there are many items in the tables, it is useful to have ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8895) Remove deprecated BlockStorageLocation APIs
[ https://issues.apache.org/jira/browse/HDFS-8895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-8895: -- Release Note: This removes the deprecated DistributedFileSystem#getFileBlockStorageLocations API used for getting VolumeIds of block replicas. Instead, use BlockLocation#getStorageIds to get very similar information. Remove deprecated BlockStorageLocation APIs --- Key: HDFS-8895 URL: https://issues.apache.org/jira/browse/HDFS-8895 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: HDFS-8895.001.patch HDFS-8887 supersedes DistributedFileSystem#getFileBlockStorageLocations, so it can be removed from trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696162#comment-14696162 ] Yufei Gu commented on HDFS-8828: No. Just one CREATE item in the snapshot diff report in this case. Utilize Snapshot diff report to build copy list in distcp - Key: HDFS-8828 URL: https://issues.apache.org/jira/browse/HDFS-8828 Project: Hadoop HDFS Issue Type: Improvement Components: distcp, snapshots Reporter: Yufei Gu Assignee: Yufei Gu Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, HDFS-8828.006.patch, HDFS-8828.007.patch Some users reported huge time cost to build the file copy list in distcp (30 hours for 1.6M files). We can leverage the snapshot diff report to build a file copy list including only files/dirs which changed between two snapshots (or a snapshot and a normal dir). It speeds up the process in two ways: 1. less copy-list building time. 2. fewer file copy MR jobs. The HDFS snapshot diff report provides information about file/directory creation, deletion, rename and modification between two snapshots or a snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to the default distcp, so it still relies on default distcp to build the complete list of files under the source dir. This patch only puts created and modified files into the copy list based on the snapshot diff report, so we can minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
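The approach described above — copying only creation and modification entries from the snapshot diff report, while deletes and renames are synchronized on the target side as in HDFS-7535 — can be sketched as follows. This is a minimal illustration, not the actual distcp code; the DiffEntry/DiffType names are hypothetical stand-ins for the HDFS diff-report types.

```java
import java.util.ArrayList;
import java.util.List;

public class DiffCopyListSketch {
    enum DiffType { CREATE, MODIFY, RENAME, DELETE }

    // Hypothetical stand-in for a snapshot diff report entry.
    record DiffEntry(DiffType type, String path) { }

    // Build the distcp copy list from a snapshot diff report: only
    // CREATE and MODIFY entries need to be copied; RENAME and DELETE
    // are handled by synchronizing the target, not by copying data.
    static List<String> buildCopyList(List<DiffEntry> report) {
        List<String> copyList = new ArrayList<>();
        for (DiffEntry e : report) {
            if (e.type() == DiffType.CREATE || e.type() == DiffType.MODIFY) {
                copyList.add(e.path());
            }
        }
        return copyList;
    }

    public static void main(String[] args) {
        List<DiffEntry> report = List.of(
            new DiffEntry(DiffType.CREATE, "/new/dir"),
            new DiffEntry(DiffType.RENAME, "/x/y"),
            new DiffEntry(DiffType.MODIFY, "/data/file"));
        System.out.println(buildCopyList(report)); // [/new/dir, /data/file]
    }
}
```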
[jira] [Updated] (HDFS-8435) createNonRecursive support needed in WebHdfsFileSystem to support HBase
[ https://issues.apache.org/jira/browse/HDFS-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated HDFS-8435: -- Status: Patch Available (was: Open) createNonRecursive support needed in WebHdfsFileSystem to support HBase --- Key: HDFS-8435 URL: https://issues.apache.org/jira/browse/HDFS-8435 Project: Hadoop HDFS Issue Type: Improvement Components: webhdfs Affects Versions: 2.6.0 Reporter: Vinoth Sathappan Assignee: Jakob Homan Attachments: HDFS-8435-branch-2.7.001.patch, HDFS-8435.001.patch, HDFS-8435.002.patch, HDFS-8435.003.patch The WebHdfsFileSystem implementation doesn't support createNonRecursive. HBase extensively depends on that for proper functioning. Currently, when the region servers are started over web hdfs, they crash with - createNonRecursive unsupported for this filesystem class org.apache.hadoop.hdfs.web.SWebHdfsFileSystem at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1137) at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1112) at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1088) at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.init(ProtobufLogWriter.java:85) at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:198) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8435) createNonRecursive support needed in WebHdfsFileSystem to support HBase
[ https://issues.apache.org/jira/browse/HDFS-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated HDFS-8435: -- Attachment: HDFS-8435.003.patch New patch that applies to both trunk and branch 2. The failed tests were because the default of createParent param in WebHDFS was being set to false, but then not being used by the actual call and overridden to true in the create call on the dfsclient. I've fixed this to pay attention to the parameter and updated the spec to be correct. Good catch on the throw. Removed. I had played around with that uber test a bit. Using the annotation loses the explicit method about what went wrong on each test. I put as much into the helper method as looked reasonable (judgment call here); when I put more of the per-test logic into the helper (expected exception, subsequent message), it got really crowded and ugly. createNonRecursive support needed in WebHdfsFileSystem to support HBase --- Key: HDFS-8435 URL: https://issues.apache.org/jira/browse/HDFS-8435 Project: Hadoop HDFS Issue Type: Improvement Components: webhdfs Affects Versions: 2.6.0 Reporter: Vinoth Sathappan Assignee: Jakob Homan Attachments: HDFS-8435-branch-2.7.001.patch, HDFS-8435.001.patch, HDFS-8435.002.patch, HDFS-8435.003.patch The WebHdfsFileSystem implementation doesn't support createNonRecursive. HBase extensively depends on that for proper functioning. 
Currently, when the region servers are started over web hdfs, they crash with - createNonRecursive unsupported for this filesystem class org.apache.hadoop.hdfs.web.SWebHdfsFileSystem at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1137) at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1112) at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1088) at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.init(ProtobufLogWriter.java:85) at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:198) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6244) Make Trash Interval configurable for each of the namespaces
[ https://issues.apache.org/jira/browse/HDFS-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated HDFS-6244: -- Status: Open (was: Patch Available) Make Trash Interval configurable for each of the namespaces --- Key: HDFS-6244 URL: https://issues.apache.org/jira/browse/HDFS-6244 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.5-alpha Reporter: Siqi Li Assignee: Siqi Li Labels: BB2015-05-TBR Attachments: HDFS-6244.v1.patch, HDFS-6244.v2.patch, HDFS-6244.v3.patch, HDFS-6244.v4.patch Somehow we need to avoid the cluster filling up. One solution is to have a different trash policy per namespace. However, if we can simply make the property configurable per namespace, then the same config can be rolled everywhere and we'd be done. This seems simple enough. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8883) NameNode Metrics : Add FSNameSystem lock Queue Length
[ https://issues.apache.org/jira/browse/HDFS-8883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-8883: Priority: Major (was: Minor) NameNode Metrics : Add FSNameSystem lock Queue Length - Key: HDFS-8883 URL: https://issues.apache.org/jira/browse/HDFS-8883 Project: Hadoop HDFS Issue Type: Improvement Components: HDFS Affects Versions: 2.7.1 Reporter: Anu Engineer Assignee: Anu Engineer Fix For: 2.8.0 Attachments: HDFS-8883.001.patch FSNameSystemLock can have contention when NameNode is under load. This patch adds LockQueueLength -- the number of threads waiting on FSNameSystemLock -- as a metric in NameNode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7649) Multihoming docs should emphasize using hostnames in configurations
[ https://issues.apache.org/jira/browse/HDFS-7649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-7649: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) +1 Committed for 2.8.0. Thanks [~brahmareddy]. Multihoming docs should emphasize using hostnames in configurations --- Key: HDFS-7649 URL: https://issues.apache.org/jira/browse/HDFS-7649 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Reporter: Arpit Agarwal Assignee: Brahma Reddy Battula Fix For: 2.8.0 Attachments: HDFS-7649.patch The docs should emphasize that master and slave configurations should use hostnames wherever possible. Link to current docs: https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab
[ https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695800#comment-14695800 ] Chang Li commented on HDFS-6407: [~wheat9] how soon could you check in this code? Are you still waiting for some more reviews? new namenode UI, lost ability to sort columns in datanode tab - Key: HDFS-6407 URL: https://issues.apache.org/jira/browse/HDFS-6407 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.0 Reporter: Nathan Roberts Assignee: Haohui Mai Priority: Critical Labels: BB2015-05-TBR Attachments: 002-datanodes-sorted-capacityUsed.png, 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.008.patch, HDFS-6407.009.patch, HDFS-6407.010.patch, HDFS-6407.011.patch, HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.7.patch, HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting 2.png, sorting table.png old ui supported clicking on column header to sort on that column. The new ui seems to have dropped this very useful feature. There are a few tables in the Namenode UI to display datanodes information, directory listings and snapshots. When there are many items in the tables, it is useful to have ability to sort on the different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8895) Remove deprecated BlockStorageLocation APIs
Andrew Wang created HDFS-8895: - Summary: Remove deprecated BlockStorageLocation APIs Key: HDFS-8895 URL: https://issues.apache.org/jira/browse/HDFS-8895 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Andrew Wang HDFS-8887 supersedes DistributedFileSystem#getFileBlockStorageLocations, so it can be removed from trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8891) HDFS concat should keep srcs order
[ https://issues.apache.org/jira/browse/HDFS-8891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yong Zhang updated HDFS-8891: - Attachment: HDFS-8891.002.patch Thanks [~jingzhao] for the review. Uploaded the 2nd patch based on [~jingzhao]'s comment; the UT code also needed to change. HDFS concat should keep srcs order -- Key: HDFS-8891 URL: https://issues.apache.org/jira/browse/HDFS-8891 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yong Zhang Assignee: Yong Zhang Attachments: HDFS-8891.001.patch, HDFS-8891.002.patch FSDirConcatOp.verifySrcFiles may change the src files' order, but it should keep their order as input. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
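The reordering described in HDFS-8891 is what happens when source paths are deduplicated through an unordered collection. A minimal sketch of an order-preserving check (not the actual FSDirConcatOp code; verifySrcs here is an illustrative stand-in):

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ConcatOrderSketch {
    // Deduplicate and validate the src paths while preserving caller order.
    // A plain HashSet would iterate in hash order, which is the kind of
    // reordering HDFS-8891 describes.
    static List<String> verifySrcs(String[] srcs) {
        Set<String> seen = new LinkedHashSet<>();
        for (String src : srcs) {
            if (src == null || src.isEmpty()) {
                throw new IllegalArgumentException("empty src path");
            }
            seen.add(src); // LinkedHashSet retains insertion order
        }
        return List.copyOf(seen);
    }

    public static void main(String[] args) {
        List<String> ordered = verifySrcs(new String[] {"/b", "/a", "/c", "/a"});
        System.out.println(ordered); // [/b, /a, /c]
    }
}
```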
[jira] [Updated] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity
[ https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Sasaki updated HDFS-8287: - Attachment: HDFS-8287-HDFS-7285.03.patch DFSStripedOutputStream.writeChunk should not wait for writing parity - Key: HDFS-8287 URL: https://issues.apache.org/jira/browse/HDFS-8287 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Kai Sasaki Attachments: HDFS-8287-HDFS-7285.00.patch, HDFS-8287-HDFS-7285.01.patch, HDFS-8287-HDFS-7285.02.patch, HDFS-8287-HDFS-7285.03.patch When a striping cell is full, writeChunk computes and generates parity packets. It sequentially calls waitAndQueuePacket so that the user client cannot continue to write data until it finishes. We should allow the user client to continue writing instead of blocking it while parity is written. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
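The idea in this issue — moving parity generation off the data path so writeChunk returns without waiting — can be sketched with a plain single-thread executor. All class and method names below are illustrative, and the "encoding" is a stand-in byte flip; the real DFSStripedOutputStream logic (cell buffering, packet queues, Reed-Solomon encoding) is far more involved.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class AsyncParitySketch {
    private final BlockingQueue<byte[]> parityQueue = new LinkedBlockingQueue<>();
    private final ExecutorService parityWorker = Executors.newSingleThreadExecutor();

    // Data path: instead of computing and queuing parity inline (which
    // would block the caller, as the issue describes), hand the full cell
    // to a background worker and return immediately.
    void writeChunk(byte[] cell) {
        parityWorker.execute(() -> parityQueue.add(encodeParity(cell)));
    }

    // Stand-in for real erasure-coding parity computation.
    byte[] encodeParity(byte[] cell) {
        byte[] parity = new byte[cell.length];
        for (int i = 0; i < cell.length; i++) parity[i] = (byte) ~cell[i];
        return parity;
    }

    // Wait for all background parity work to finish; returns queued count.
    int drainAndShutdown() throws InterruptedException {
        parityWorker.shutdown();
        parityWorker.awaitTermination(5, TimeUnit.SECONDS);
        return parityQueue.size();
    }

    public static void main(String[] args) throws InterruptedException {
        AsyncParitySketch s = new AsyncParitySketch();
        for (int i = 0; i < 4; i++) s.writeChunk(new byte[] {1, 2, 3});
        System.out.println(s.drainAndShutdown()); // 4
    }
}
```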
[jira] [Commented] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity
[ https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696349#comment-14696349 ] Kai Sasaki commented on HDFS-8287: -- I rebased HDFS-7285. Could you please check it? Thank you! DFSStripedOutputStream.writeChunk should not wait for writing parity - Key: HDFS-8287 URL: https://issues.apache.org/jira/browse/HDFS-8287 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Kai Sasaki Attachments: HDFS-8287-HDFS-7285.00.patch, HDFS-8287-HDFS-7285.01.patch, HDFS-8287-HDFS-7285.02.patch, HDFS-8287-HDFS-7285.03.patch When a striping cell is full, writeChunk computes and generates parity packets. It sequentially calls waitAndQueuePacket so that the user client cannot continue to write data until it finishes. We should allow the user client to continue writing instead of blocking it while parity is written. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7980) Incremental BlockReport will dramatically slow down the startup of a namenode
[ https://issues.apache.org/jira/browse/HDFS-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696376#comment-14696376 ] Zhihua Deng commented on HDFS-7980: --- Recently, we encountered the same problem in our cluster running version 2.4.1 and created a patch (https://github.com/dengzhhu653/hdfs-2.4.1/blob/master/hadoop-241.patch) based on the patch attached: let the restarted NN process the first full report with the faster processFirstBlockReport method, and add a condition AddBlockResult.ADDED == result in the addStoredBlockImmediate method when FSNameSystem tries to invoke the incrementSafeBlockCount method. The problem is I am not sure whether the patch has any potential issues when applied to our cluster; any advice and opinions will be greatly appreciated and taken seriously, thanks! Incremental BlockReport will dramatically slow down the startup of a namenode -- Key: HDFS-7980 URL: https://issues.apache.org/jira/browse/HDFS-7980 Project: Hadoop HDFS Issue Type: Bug Reporter: Hui Zheng Assignee: Walter Su Labels: 2.6.1-candidate Fix For: 2.7.1 Attachments: HDFS-7980.001.patch, HDFS-7980.002.patch, HDFS-7980.003.patch, HDFS-7980.004.patch, HDFS-7980.004.repost.patch In the current implementation the datanode will call the reportReceivedDeletedBlocks() method (an IncrementalBlockReport) before calling the bpNamenode.blockReport() method. So in a large (several thousands of datanodes) and busy cluster it will slow down (by more than one hour) the startup of the namenode.
{code}
List<DatanodeCommand> blockReport() throws IOException {
  // send block report if timer has expired.
  final long startTime = now();
  if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
    return null;
  }
  final ArrayList<DatanodeCommand> cmds = new ArrayList<DatanodeCommand>();
  // Flush any block information that precedes the block report. Otherwise
  // we have a chance that we will miss the delHint information
  // or we will report an RBW replica after the BlockReport already reports
  // a FINALIZED one.
  reportReceivedDeletedBlocks();
  lastDeletedReport = startTime;
  ...
  // Send the reports to the NN.
  int numReportsSent = 0;
  int numRPCs = 0;
  boolean success = false;
  long brSendStartTime = now();
  try {
    if (totalBlockCount < dnConf.blockReportSplitThreshold) {
      // Below split threshold, send all reports in a single message.
      DatanodeCommand cmd = bpNamenode.blockReport(
          bpRegistration, bpos.getBlockPoolId(), reports);
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity
[ https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696458#comment-14696458 ] Hadoop QA commented on HDFS-8287: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 15m 43s | Findbugs (version ) appears to be broken on HDFS-7285. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 47s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 57s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 39s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 40s | mvn install still works. | | {color:red}-1{color} | eclipse:eclipse | 0m 15s | The patch failed to build with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 0m 25s | Post-patch findbugs hadoop-hdfs-project/hadoop-hdfs compilation is broken. | | {color:green}+1{color} | findbugs | 0m 25s | The patch does not introduce any new Findbugs (version ) warnings. 
| | {color:red}-1{color} | native | 0m 23s | Failed to build the native portion of hadoop-common prior to running the unit tests in hadoop-hdfs-project/hadoop-hdfs | | | | 37m 9s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750432/HDFS-8287-HDFS-7285.03.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | HDFS-7285 / 1d37a88 | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/11993/artifact/patchprocess/patchReleaseAuditProblems.txt | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11993/console | This message was automatically generated. DFSStripedOutputStream.writeChunk should not wait for writing parity - Key: HDFS-8287 URL: https://issues.apache.org/jira/browse/HDFS-8287 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Kai Sasaki Attachments: HDFS-8287-HDFS-7285.00.patch, HDFS-8287-HDFS-7285.01.patch, HDFS-8287-HDFS-7285.02.patch, HDFS-8287-HDFS-7285.03.patch When a stripping cell is full, writeChunk computes and generates parity packets. It sequentially calls waitAndQueuePacket so that user client cannot continue to write data until it finishes. We should allow user client to continue writing instead but not blocking it when writing parity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7980) Incremental BlockReport will dramatically slow down the startup of a namenode
[ https://issues.apache.org/jira/browse/HDFS-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihua Deng updated HDFS-7980: -- Attachment: hadoop-241.patch Incremental BlockReport will dramatically slow down the startup of a namenode -- Key: HDFS-7980 URL: https://issues.apache.org/jira/browse/HDFS-7980 Project: Hadoop HDFS Issue Type: Bug Reporter: Hui Zheng Assignee: Walter Su Labels: 2.6.1-candidate Fix For: 2.7.1 Attachments: HDFS-7980.001.patch, HDFS-7980.002.patch, HDFS-7980.003.patch, HDFS-7980.004.patch, HDFS-7980.004.repost.patch, hadoop-241.patch In the current implementation the datanode will call the reportReceivedDeletedBlocks() method (an IncrementalBlockReport) before calling the bpNamenode.blockReport() method. So in a large (several thousands of datanodes) and busy cluster it will slow down (by more than one hour) the startup of the namenode.
{code}
List<DatanodeCommand> blockReport() throws IOException {
  // send block report if timer has expired.
  final long startTime = now();
  if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
    return null;
  }
  final ArrayList<DatanodeCommand> cmds = new ArrayList<DatanodeCommand>();
  // Flush any block information that precedes the block report. Otherwise
  // we have a chance that we will miss the delHint information
  // or we will report an RBW replica after the BlockReport already reports
  // a FINALIZED one.
  reportReceivedDeletedBlocks();
  lastDeletedReport = startTime;
  ...
  // Send the reports to the NN.
  int numReportsSent = 0;
  int numRPCs = 0;
  boolean success = false;
  long brSendStartTime = now();
  try {
    if (totalBlockCount < dnConf.blockReportSplitThreshold) {
      // Below split threshold, send all reports in a single message.
      DatanodeCommand cmd = bpNamenode.blockReport(
          bpRegistration, bpos.getBlockPoolId(), reports);
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6244) Make Trash Interval configurable for each of the namespaces
[ https://issues.apache.org/jira/browse/HDFS-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696329#comment-14696329 ] Ming Ma commented on HDFS-6244: --- Thanks [~l201514]! The patch adds a key with the prefix dfs.federation to CommonConfigurationKeysPublic. Not sure if that is a good place to put it, given federation is specific to HDFS while CommonConfigurationKeysPublic and Trash are under hadoop-common-project and might be designed to be used by any FileSystem. Your early patch had NameNode read the new property defined in hdfs-site.xml and set the value for {{fs.trash.interval}} before creating {{Trash}}. Any reason not to go with that? {{dfs.federation.trash.interval.ns.}} might be misleading as ns might mean nanosecond; minutes might be better. Another thing: maybe we can drop federation from the name; {{dfs.trash.interval.minutes}} is good enough, just like how {{dfs.namenode.rpc-address}} is used as a prefix for different namespaces. It might be useful to add some description for the new property and how it overrides {{fs.trash.interval}}. The patch also includes an unrelated FairSchedulerPage change. Make Trash Interval configurable for each of the namespaces --- Key: HDFS-6244 URL: https://issues.apache.org/jira/browse/HDFS-6244 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.5-alpha Reporter: Siqi Li Assignee: Siqi Li Labels: BB2015-05-TBR Attachments: HDFS-6244.v1.patch, HDFS-6244.v2.patch, HDFS-6244.v3.patch, HDFS-6244.v4.patch Somehow we need to avoid the cluster filling up. One solution is to have a different trash policy per namespace. However, if we can simply make the property configurable per namespace, then the same config can be rolled everywhere and we'd be done. This seems simple enough. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
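The suggestion in the comment above — a per-namespace property resolved like {{dfs.namenode.rpc-address}}, falling back to the global {{fs.trash.interval}} — might look like this in miniature. The property names follow the discussion and are illustrative, not committed config keys; a Map stands in for Hadoop's Configuration.

```java
import java.util.HashMap;
import java.util.Map;

public class PerNsTrashSketch {
    // Resolve a per-nameservice override by suffixing the property with
    // the nameservice id, falling back to the global value — mirroring
    // the dfs.namenode.rpc-address suffix convention discussed above.
    static long trashIntervalMinutes(Map<String, String> conf, String nsId) {
        String v = conf.get("fs.trash.interval." + nsId);
        if (v == null) v = conf.getOrDefault("fs.trash.interval", "0");
        return Long.parseLong(v);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("fs.trash.interval", "1440");     // global: 1 day
        conf.put("fs.trash.interval.ns1", "4320"); // ns1 override: 3 days
        System.out.println(trashIntervalMinutes(conf, "ns1")); // 4320
        System.out.println(trashIntervalMinutes(conf, "ns2")); // 1440
    }
}
```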
[jira] [Updated] (HDFS-7980) Incremental BlockReport will dramatically slow down the startup of a namenode
[ https://issues.apache.org/jira/browse/HDFS-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihua Deng updated HDFS-7980: -- Attachment: (was: hadoop-241.patch) Incremental BlockReport will dramatically slow down the startup of a namenode -- Key: HDFS-7980 URL: https://issues.apache.org/jira/browse/HDFS-7980 Project: Hadoop HDFS Issue Type: Bug Reporter: Hui Zheng Assignee: Walter Su Labels: 2.6.1-candidate Fix For: 2.7.1 Attachments: HDFS-7980.001.patch, HDFS-7980.002.patch, HDFS-7980.003.patch, HDFS-7980.004.patch, HDFS-7980.004.repost.patch In the current implementation the datanode will call the reportReceivedDeletedBlocks() method (an IncrementalBlockReport) before calling the bpNamenode.blockReport() method. So in a large (several thousands of datanodes) and busy cluster it will slow down (by more than one hour) the startup of the namenode.
{code}
List<DatanodeCommand> blockReport() throws IOException {
  // send block report if timer has expired.
  final long startTime = now();
  if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
    return null;
  }
  final ArrayList<DatanodeCommand> cmds = new ArrayList<DatanodeCommand>();
  // Flush any block information that precedes the block report. Otherwise
  // we have a chance that we will miss the delHint information
  // or we will report an RBW replica after the BlockReport already reports
  // a FINALIZED one.
  reportReceivedDeletedBlocks();
  lastDeletedReport = startTime;
  ...
  // Send the reports to the NN.
  int numReportsSent = 0;
  int numRPCs = 0;
  boolean success = false;
  long brSendStartTime = now();
  try {
    if (totalBlockCount < dnConf.blockReportSplitThreshold) {
      // Below split threshold, send all reports in a single message.
      DatanodeCommand cmd = bpNamenode.blockReport(
          bpRegistration, bpos.getBlockPoolId(), reports);
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7915) The DataNode can sometimes allocate a ShortCircuitShm slot and fail to tell the DFSClient about it because of a network error
[ https://issues.apache.org/jira/browse/HDFS-7915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-7915: Attachment: HDFS-7915.branch-2.6.patch HADOOP-11802 depends on this issue. If we are going to cherry-pick HADOOP-11802, we need to cherry-pick this issue first. Attaching a patch for branch-2.6. The DataNode can sometimes allocate a ShortCircuitShm slot and fail to tell the DFSClient about it because of a network error - Key: HDFS-7915 URL: https://issues.apache.org/jira/browse/HDFS-7915 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Fix For: 2.7.0 Attachments: HDFS-7915.001.patch, HDFS-7915.002.patch, HDFS-7915.004.patch, HDFS-7915.005.patch, HDFS-7915.006.patch, HDFS-7915.branch-2.6.patch The DataNode can sometimes allocate a ShortCircuitShm slot and fail to tell the DFSClient about it because of a network error. In {{DataXceiver#requestShortCircuitFds}}, the DataNode can succeed at the first part (mark the slot as used) and fail at the second part (tell the DFSClient what it did). The try block for unregistering the slot only covers a failure in the first part, not the second part. In this way, a divergence can form between the views of which slots are allocated on DFSClient and on server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
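The bug pattern HDFS-7915 fixes — a try block that covers allocating the shm slot but not the reply to the DFSClient — can be shown in miniature. This is an illustrative sketch, not the DataXceiver code; the point is that the rollback path must cover a failure in either step so the client and server views of allocated slots stay in sync.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

public class SlotReleaseSketch {
    final Set<Integer> allocatedSlots = new HashSet<>();
    private int nextSlot = 0;

    // Step 1: mark a shm slot as used (stand-in for real slot allocation).
    int allocateSlot() {
        int slot = nextSlot++;
        allocatedSlots.add(slot);
        return slot;
    }

    // The unregister path covers a failure in *either* step — allocating
    // the slot or telling the client about it — so a network error after
    // allocation cannot leave an orphaned slot on the server.
    void requestShortCircuitFds(boolean networkFails) throws IOException {
        int slot = allocateSlot();
        boolean responded = false;
        try {
            // Step 2: send the response to the DFSClient.
            if (networkFails) throw new IOException("simulated network error");
            responded = true;
        } finally {
            if (!responded) allocatedSlots.remove(slot); // roll back on any failure
        }
    }

    public static void main(String[] args) throws IOException {
        SlotReleaseSketch dn = new SlotReleaseSketch();
        try { dn.requestShortCircuitFds(true); } catch (IOException ignored) { }
        System.out.println(dn.allocatedSlots.size()); // 0
        dn.requestShortCircuitFds(false);
        System.out.println(dn.allocatedSlots.size()); // 1
    }
}
```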
[jira] [Updated] (HDFS-8070) Pre-HDFS-7915 DFSClient cannot use short circuit on post-HDFS-7915 DataNode
[ https://issues.apache.org/jira/browse/HDFS-8070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-8070: Attachment: HDFS-8070.branch-2.6.patch Attaching a patch for branch-2.6. If we are going to include HADOOP-11802, we need to include HDFS-7915 and this issue as well. Pre-HDFS-7915 DFSClient cannot use short circuit on post-HDFS-7915 DataNode --- Key: HDFS-8070 URL: https://issues.apache.org/jira/browse/HDFS-8070 Project: Hadoop HDFS Issue Type: Bug Components: caching Affects Versions: 2.7.0 Reporter: Gopal V Assignee: Colin Patrick McCabe Priority: Blocker Fix For: 2.7.1 Attachments: HDFS-8070.001.patch, HDFS-8070.branch-2.6.patch HDFS ShortCircuitShm layer keeps the task locked up during multi-threaded split-generation. I hit this immediately after I upgraded the data, so I wonder if the ShortCircuitShim wire protocol has trouble when 2.8.0 DN talks to a 2.7.0 Client? {code} 2015-04-06 00:04:30,780 INFO [ORC_GET_SPLITS #3] orc.OrcInputFormat: ORC pushdown predicate: leaf-0 = (IS_NULL ss_sold_date_sk) expr = (not leaf-0) 2015-04-06 00:04:30,781 ERROR [ShortCircuitCache_SlotReleaser] shortcircuit.ShortCircuitCache: ShortCircuitCache(0x29e82045): failed to release short-circuit shared memory slot Slot(slotIdx=2, shm=DfsClientShm(a86ee34576d93c4964005d90b0d97c38)) by sending ReleaseShortCircuitAccessRequestProto to /grid/0/cluster/hdfs/dn_socket. Closing shared memory segment. 
java.io.IOException: ERROR_INVALID: there is no shared memory segment registered with shmId a86ee34576d93c4964005d90b0d97c38 at org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache$SlotReleaser.run(ShortCircuitCache.java:208) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) 2015-04-06 00:04:30,781 INFO [ORC_GET_SPLITS #5] orc.OrcInputFormat: ORC pushdown predicate: leaf-0 = (IS_NULL ss_sold_date_sk) expr = (not leaf-0) 2015-04-06 00:04:30,781 WARN [ShortCircuitCache_SlotReleaser] shortcircuit.DfsClientShmManager: EndpointShmManager(172.19.128.60:50010, parent=ShortCircuitShmManager(5e763476)): error shutting down shm: got IOException calling shutdown(SHUT_RDWR) java.nio.channels.ClosedChannelException at org.apache.hadoop.util.CloseableReferenceCount.reference(CloseableReferenceCount.java:57) at org.apache.hadoop.net.unix.DomainSocket.shutdown(DomainSocket.java:387) at org.apache.hadoop.hdfs.shortcircuit.DfsClientShmManager$EndpointShmManager.shutdown(DfsClientShmManager.java:378) at org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache$SlotReleaser.run(ShortCircuitCache.java:223) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) 2015-04-06 00:04:30,783 INFO [ORC_GET_SPLITS #7] orc.OrcInputFormat: ORC pushdown predicate: leaf-0 = (IS_NULL cs_sold_date_sk) expr = (not leaf-0) 2015-04-06 00:04:30,785 ERROR [ShortCircuitCache_SlotReleaser] shortcircuit.ShortCircuitCache: ShortCircuitCache(0x29e82045): failed to release short-circuit shared memory slot Slot(slotIdx=4, shm=DfsClientShm(a86ee34576d93c4964005d90b0d97c38)) by sending ReleaseShortCircuitAccessRequestProto to /grid/0/cluster/hdfs/dn_socket. Closing shared memory segment. java.io.IOException: ERROR_INVALID: there is no shared memory segment registered with shmId a86ee34576d93c4964005d90b0d97c38 at org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache$SlotReleaser.run(ShortCircuitCache.java:208) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at
[jira] [Commented] (HDFS-8895) Remove deprecated BlockStorageLocation APIs
[ https://issues.apache.org/jira/browse/HDFS-8895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696399#comment-14696399 ] Hadoop QA commented on HDFS-8895: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 19m 23s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 4 new or modified test files. | | {color:green}+1{color} | javac | 7m 56s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 45s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 2m 34s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 30s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 10s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 0m 26s | Tests failed in hadoop-hdfs. | | {color:green}+1{color} | hdfs tests | 0m 28s | Tests passed in hadoop-hdfs-client. 
| | | | 50m 47s | | \\ \\ || Reason || Tests || | Failed build | hadoop-hdfs | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750398/HDFS-8895.001.patch | | Optional Tests | javac unit javadoc findbugs checkstyle | | git revision | trunk / 0a03054 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11991/artifact/patchprocess/testrun_hadoop-hdfs.txt | | hadoop-hdfs-client test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11991/artifact/patchprocess/testrun_hadoop-hdfs-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11991/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11991/console | This message was automatically generated. Remove deprecated BlockStorageLocation APIs --- Key: HDFS-8895 URL: https://issues.apache.org/jira/browse/HDFS-8895 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: HDFS-8895.001.patch HDFS-8887 supercedes DistributedFileSystem#getFileBlockStorageLocations, so it can be removed from trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8622) Implement GETCONTENTSUMMARY operation for WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-8622: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) I've committed this to trunk and branch-2. Thanks [~jagadesh.kiran] for the continuous work! Implement GETCONTENTSUMMARY operation for WebImageViewer Key: HDFS-8622 URL: https://issues.apache.org/jira/browse/HDFS-8622 Project: Hadoop HDFS Issue Type: New Feature Reporter: Jagadesh Kiran N Assignee: Jagadesh Kiran N Fix For: 2.8.0 Attachments: HDFS-8622-00.patch, HDFS-8622-01.patch, HDFS-8622-02.patch, HDFS-8622-03.patch, HDFS-8622-04.patch, HDFS-8622-05.patch, HDFS-8622-06.patch, HDFS-8622-07.patch, HDFS-8622-08.patch, HDFS-8622-09.patch, HDFS-8622-10.patch It would be better for administrators if {code} GETCONTENTSUMMARY {code} is supported. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-6939) Support path-based filtering of inotify events
[ https://issues.apache.org/jira/browse/HDFS-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Surendra Singh Lilhore reassigned HDFS-6939: Assignee: Surendra Singh Lilhore Support path-based filtering of inotify events -- Key: HDFS-6939 URL: https://issues.apache.org/jira/browse/HDFS-6939 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, namenode, qjm Reporter: James Thomas Assignee: Surendra Singh Lilhore Users should be able to specify that they only want events involving particular paths. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7649) Multihoming docs should emphasize using hostnames in configurations
[ https://issues.apache.org/jira/browse/HDFS-7649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696288#comment-14696288 ] Brahma Reddy Battula commented on HDFS-7649: [~arpitagarwal] thanks a lot for your review and commit!! Multihoming docs should emphasize using hostnames in configurations --- Key: HDFS-7649 URL: https://issues.apache.org/jira/browse/HDFS-7649 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Reporter: Arpit Agarwal Assignee: Brahma Reddy Battula Fix For: 2.8.0 Attachments: HDFS-7649.patch The docs should emphasize that master and slave configurations should use hostnames wherever possible. Link to current docs: https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7278) Add a command that allows sysadmins to manually trigger full block reports from a DN
[ https://issues.apache.org/jira/browse/HDFS-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated HDFS-7278: -- Labels: 2.6.1-candidate (was: ) Add a command that allows sysadmins to manually trigger full block reports from a DN Key: HDFS-7278 URL: https://issues.apache.org/jira/browse/HDFS-7278 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Labels: 2.6.1-candidate Fix For: 2.7.0 Attachments: HDFS-7278.002.patch, HDFS-7278.003.patch, HDFS-7278.004.patch, HDFS-7278.005.patch We should add a command that allows sysadmins to manually trigger full block reports from a DN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696383#comment-14696383 ] Hadoop QA commented on HDFS-8828: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 43s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 41s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 40s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 26s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 4s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 24s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 49s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | tools/hadoop tests | 6m 26s | Tests passed in hadoop-distcp. 
| | | | 43m 13s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750404/HDFS-8828.007.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 0a03054 | | hadoop-distcp test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11990/artifact/patchprocess/testrun_hadoop-distcp.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11990/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11990/console | This message was automatically generated. Utilize Snapshot diff report to build copy list in distcp - Key: HDFS-8828 URL: https://issues.apache.org/jira/browse/HDFS-8828 Project: Hadoop HDFS Issue Type: Improvement Components: distcp, snapshots Reporter: Yufei Gu Assignee: Yufei Gu Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, HDFS-8828.006.patch, HDFS-8828.007.patch Some users reported a huge time cost to build the file copy list in distcp (30 hours for 1.6M files). We can leverage the snapshot diff report to build a file copy list that includes only the files/dirs changed between two snapshots (or between a snapshot and a normal dir). It speeds up the process in two ways: 1. less copy-list building time. 2. fewer file copy MR jobs. The HDFS snapshot diff report provides information about file/directory creation, deletion, rename and modification between two snapshots, or between a snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to the default distcp, so it still relies on the default distcp to build the complete list of files under the source dir. This patch only puts created and modified files into the copy list based on the snapshot diff report.
We can minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
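The selection described above can be sketched in a few lines. This is a toy model only: the DiffType/DiffEntry types below are illustrative stand-ins, not Hadoop's actual SnapshotDiffReport or DistCp API. The point is that only creation and modification entries produce copy work, while deletion and rename are handled by a separate sync step (HDFS-7535 style) before copying.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of building a distcp copy list from a snapshot diff report.
 *  DiffType and DiffEntry are illustrative stand-ins, not Hadoop's API. */
public class CopyListSketch {
    enum DiffType { CREATE, MODIFY, DELETE, RENAME }

    static final class DiffEntry {
        final DiffType type;
        final String path;
        DiffEntry(DiffType type, String path) { this.type = type; this.path = path; }
    }

    /** Only CREATE and MODIFY entries enter the copy list; DELETE and
     *  RENAME are applied to the target in a separate sync step. */
    static List<String> buildCopyList(List<DiffEntry> diff) {
        List<String> copyList = new ArrayList<>();
        for (DiffEntry e : diff) {
            if (e.type == DiffType.CREATE || e.type == DiffType.MODIFY) {
                copyList.add(e.path);
            }
        }
        return copyList;
    }

    public static void main(String[] args) {
        List<DiffEntry> diff = Arrays.asList(
            new DiffEntry(DiffType.CREATE, "/src/newDir"),
            new DiffEntry(DiffType.MODIFY, "/src/a.txt"),
            new DiffEntry(DiffType.DELETE, "/src/old.txt"),
            new DiffEntry(DiffType.RENAME, "/src/x"));
        // Only the CREATE and MODIFY paths survive the filter.
        System.out.println(buildCopyList(diff));
    }
}
```

The copy list is then proportional to the amount of change since the last snapshot rather than to the total number of files under the source, which is where the time savings come from.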
[jira] [Commented] (HDFS-8435) createNonRecursive support needed in WebHdfsFileSystem to support HBase
[ https://issues.apache.org/jira/browse/HDFS-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696394#comment-14696394 ] Hadoop QA commented on HDFS-8435: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 24m 56s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:red}-1{color} | javac | 7m 46s | The applied patch generated 1 additional warning messages. | | {color:green}+1{color} | javadoc | 10m 10s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | site | 2m 58s | Site still builds. | | {color:red}-1{color} | checkstyle | 3m 38s | The applied patch generated 1 new checkstyle issues (total was 104, now 105). | | {color:red}-1{color} | whitespace | 0m 1s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 30s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 6m 34s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | common tests | 23m 20s | Tests failed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 118m 24s | Tests failed in hadoop-hdfs. | | {color:green}+1{color} | hdfs tests | 0m 28s | Tests passed in hadoop-hdfs-client. 
| | | | 200m 44s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.ha.TestZKFailoverController | | | hadoop.net.TestNetUtils | | Timed out tests | org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshot | | | org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation | | | org.apache.hadoop.hdfs.server.blockmanagement.TestBlockReportRateLimiting | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750390/HDFS-8435.003.patch | | Optional Tests | javadoc javac unit findbugs checkstyle site | | git revision | trunk / 0a03054 | | javac | https://builds.apache.org/job/PreCommit-HDFS-Build/11989/artifact/patchprocess/diffJavacWarnings.txt | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11989/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt | | whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11989/artifact/patchprocess/whitespace.txt | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11989/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11989/artifact/patchprocess/testrun_hadoop-hdfs.txt | | hadoop-hdfs-client test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11989/artifact/patchprocess/testrun_hadoop-hdfs-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11989/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11989/console | This message was automatically generated. 
createNonRecursive support needed in WebHdfsFileSystem to support HBase --- Key: HDFS-8435 URL: https://issues.apache.org/jira/browse/HDFS-8435 Project: Hadoop HDFS Issue Type: Improvement Components: webhdfs Affects Versions: 2.6.0 Reporter: Vinoth Sathappan Assignee: Jakob Homan Attachments: HDFS-8435-branch-2.7.001.patch, HDFS-8435.001.patch, HDFS-8435.002.patch, HDFS-8435.003.patch The WebHdfsFileSystem implementation doesn't support createNonRecursive, which HBase depends on extensively for proper functioning. Currently, when the region servers are started over WebHDFS, they crash with: createNonRecursive unsupported for this filesystem class org.apache.hadoop.hdfs.web.SWebHdfsFileSystem at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1137) at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1112) at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1088) at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.init(ProtobufLogWriter.java:85) at
[jira] [Commented] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart
[ https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695454#comment-14695454 ] Hudson commented on HDFS-8879: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #275 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/275/]) HDFS-8879. Quota by storage type usage incorrectly initialized upon namenode restart. Contributed by Xiaoyu Yao. (xyao: rev 3e715a4f4c46bcd8b3054cb0566e526c46bd5d66) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestQuotaByStorageType.java Quota by storage type usage incorrectly initialized upon namenode restart - Key: HDFS-8879 URL: https://issues.apache.org/jira/browse/HDFS-8879 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Kihwal Lee Assignee: Xiaoyu Yao Fix For: 2.8.0 Attachments: HDFS-8879.01.patch This was found by [~kihwal] as part of the HDFS-8865 work in this [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904]. The unit tests testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit failed to detect this because they were using an obsolete FSDirectory instance. Once the highlighted line below is added, the issue can be reproduced. {code} fsdir = cluster.getNamesystem().getFSDirectory(); INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString()); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8622) Implement GETCONTENTSUMMARY operation for WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695457#comment-14695457 ] Hudson commented on HDFS-8622: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #275 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/275/]) HDFS-8622. Implement GETCONTENTSUMMARY operation for WebImageViewer. Contributed by Jagadesh Kiran N. (aajisaka: rev 40f815131e822f5b7a8e6a6827f4b85b31220c43) * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsImageViewer.md * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewerForContentSummary.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageHandler.java Implement GETCONTENTSUMMARY operation for WebImageViewer Key: HDFS-8622 URL: https://issues.apache.org/jira/browse/HDFS-8622 Project: Hadoop HDFS Issue Type: New Feature Reporter: Jagadesh Kiran N Assignee: Jagadesh Kiran N Attachments: HDFS-8622-00.patch, HDFS-8622-01.patch, HDFS-8622-02.patch, HDFS-8622-03.patch, HDFS-8622-04.patch, HDFS-8622-05.patch, HDFS-8622-06.patch, HDFS-8622-07.patch, HDFS-8622-08.patch, HDFS-8622-09.patch, HDFS-8622-10.patch It would be better for administrators if {code} GETCONTENTSUMMARY {code} is supported. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp
[ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695467#comment-14695467 ] Yufei Gu commented on HDFS-8828: Hi [~yzhangal], Thank you for the detailed review. For 3, we do need to traverse recursively, because a created directory item in a snapshot diff report could have multiple levels of subdirectories. Utilize Snapshot diff report to build copy list in distcp - Key: HDFS-8828 URL: https://issues.apache.org/jira/browse/HDFS-8828 Project: Hadoop HDFS Issue Type: Improvement Components: distcp, snapshots Reporter: Yufei Gu Assignee: Yufei Gu Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, HDFS-8828.006.patch Some users reported a huge time cost to build the file copy list in distcp (30 hours for 1.6M files). We can leverage the snapshot diff report to build a file copy list that includes only the files/dirs changed between two snapshots (or between a snapshot and a normal dir). It speeds up the process in two ways: 1. less copy-list building time. 2. fewer file copy MR jobs. The HDFS snapshot diff report provides information about file/directory creation, deletion, rename and modification between two snapshots, or between a snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to the default distcp, so it still relies on the default distcp to build the complete list of files under the source dir. This patch only puts created and modified files into the copy list based on the snapshot diff report. We can minimize the number of files to copy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
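Yufei's point about recursion can be made concrete with a small sketch: a directory that appears as a single "created" entry in the diff report has to be walked recursively, because everything under it, at any depth, is new and must be copied. The in-memory parent-to-children map below is an illustrative stand-in for a real FileSystem listing, not Hadoop's API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch: expanding one "created directory" diff entry into
 *  all of its descendants. The tree map is a toy stand-in for listStatus. */
public class TraverseCreatedDir {
    /** Collects the created directory itself plus every descendant,
     *  however deep, using an explicit stack instead of recursion. */
    static List<String> collect(String dir, Map<String, List<String>> tree) {
        List<String> out = new ArrayList<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push(dir);
        while (!stack.isEmpty()) {
            String cur = stack.pop();
            out.add(cur);
            for (String child : tree.getOrDefault(cur, Collections.emptyList())) {
                stack.push(child);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> tree = new HashMap<>();
        tree.put("/src/newDir", Arrays.asList("/src/newDir/sub"));
        tree.put("/src/newDir/sub", Arrays.asList("/src/newDir/sub/f.txt"));
        // The single diff entry /src/newDir expands to three copy-list paths.
        System.out.println(collect("/src/newDir", tree));
    }
}
```

Looking only at a diff entry's direct children would miss /src/newDir/sub/f.txt here, which is why the created-directory case cannot be handled with a one-level scan.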
[jira] [Commented] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart
[ https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695479#comment-14695479 ] Hudson commented on HDFS-8879: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #283 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/283/]) HDFS-8879. Quota by storage type usage incorrectly initialized upon namenode restart. Contributed by Xiaoyu Yao. (xyao: rev 3e715a4f4c46bcd8b3054cb0566e526c46bd5d66) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestQuotaByStorageType.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Quota by storage type usage incorrectly initialized upon namenode restart - Key: HDFS-8879 URL: https://issues.apache.org/jira/browse/HDFS-8879 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Kihwal Lee Assignee: Xiaoyu Yao Fix For: 2.8.0 Attachments: HDFS-8879.01.patch This was found by [~kihwal] as part of the HDFS-8865 work in this [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904]. The unit tests testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit failed to detect this because they were using an obsolete FSDirectory instance. Once the highlighted line below is added, the issue can be reproduced. {code} fsdir = cluster.getNamesystem().getFSDirectory(); INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString()); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8622) Implement GETCONTENTSUMMARY operation for WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695482#comment-14695482 ] Hudson commented on HDFS-8622: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #283 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/283/]) HDFS-8622. Implement GETCONTENTSUMMARY operation for WebImageViewer. Contributed by Jagadesh Kiran N. (aajisaka: rev 40f815131e822f5b7a8e6a6827f4b85b31220c43) * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsImageViewer.md * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageHandler.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewerForContentSummary.java Implement GETCONTENTSUMMARY operation for WebImageViewer Key: HDFS-8622 URL: https://issues.apache.org/jira/browse/HDFS-8622 Project: Hadoop HDFS Issue Type: New Feature Reporter: Jagadesh Kiran N Assignee: Jagadesh Kiran N Attachments: HDFS-8622-00.patch, HDFS-8622-01.patch, HDFS-8622-02.patch, HDFS-8622-03.patch, HDFS-8622-04.patch, HDFS-8622-05.patch, HDFS-8622-06.patch, HDFS-8622-07.patch, HDFS-8622-08.patch, HDFS-8622-09.patch, HDFS-8622-10.patch It would be better for administrators if {code} GETCONTENTSUMMARY {code} is supported. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-8894) Set SO_KEEPALIVE on DN server sockets
[ https://issues.apache.org/jira/browse/HDFS-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kanaka kumar avvaru reassigned HDFS-8894: - Assignee: kanaka kumar avvaru Set SO_KEEPALIVE on DN server sockets - Key: HDFS-8894 URL: https://issues.apache.org/jira/browse/HDFS-8894 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.7.1 Reporter: Nathan Roberts Assignee: kanaka kumar avvaru SO_KEEPALIVE is not set on things like datastreamer sockets which can cause lingering ESTABLISHED sockets when there is a network glitch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8824) Do not use small blocks for balancing the cluster
[ https://issues.apache.org/jira/browse/HDFS-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696520#comment-14696520 ] Jitendra Nath Pandey commented on HDFS-8824: +1 for the latest patch. Do not use small blocks for balancing the cluster - Key: HDFS-8824 URL: https://issues.apache.org/jira/browse/HDFS-8824 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h8824_20150727b.patch, h8824_20150811b.patch Balancer gets datanode block lists from the NN and then moves blocks in order to balance the cluster. It should not use small blocks, since moving them generates a lot of overhead and does little to balance the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
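The idea in HDFS-8824 amounts to a size filter over the candidate blocks: each moved block costs roughly the same bookkeeping (scheduling, pending-move tracking, an inter-datanode transfer setup) regardless of size, so a tiny block pays the overhead while shifting almost no utilization. A minimal sketch of that filter, with Block and the threshold as illustrative stand-ins rather than the Balancer's actual types:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch of skipping small blocks when selecting blocks to move.
 *  Block and the threshold below are illustrative, not Balancer's API. */
public class SmallBlockFilter {
    static final class Block {
        final String id;
        final long numBytes;
        Block(String id, long numBytes) { this.id = id; this.numBytes = numBytes; }
    }

    /** Keeps only blocks at or above the minimum size. */
    static List<Block> candidates(List<Block> blocks, long minBytes) {
        List<Block> out = new ArrayList<>();
        for (Block b : blocks) {
            if (b.numBytes >= minBytes) {
                out.add(b);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Block> blocks = Arrays.asList(
            new Block("b1", 128L << 20),  // 128 MB: worth moving
            new Block("b2", 4L << 10));   // 4 KB: skipped as overhead
        System.out.println(candidates(blocks, 1L << 20).size()); // >= 1 MB only
    }
}
```

Picking the threshold is the real design question: too low and the overhead returns, too high and a cluster full of small files leaves the balancer with nothing to move.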
[jira] [Commented] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart
[ https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695387#comment-14695387 ] Hudson commented on HDFS-8879: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2213 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2213/]) HDFS-8879. Quota by storage type usage incorrectly initialized upon namenode restart. Contributed by Xiaoyu Yao. (xyao: rev 3e715a4f4c46bcd8b3054cb0566e526c46bd5d66) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestQuotaByStorageType.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Quota by storage type usage incorrectly initialized upon namenode restart - Key: HDFS-8879 URL: https://issues.apache.org/jira/browse/HDFS-8879 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Kihwal Lee Assignee: Xiaoyu Yao Fix For: 2.8.0 Attachments: HDFS-8879.01.patch This was found by [~kihwal] as part of the HDFS-8865 work in this [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904]. The unit tests testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit failed to detect this because they were using an obsolete FSDirectory instance. Once the highlighted line below is added, the issue can be reproduced. {code} fsdir = cluster.getNamesystem().getFSDirectory(); INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString()); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7926) NameNode implementation of ClientProtocol.truncate(..) is not idempotent
[ https://issues.apache.org/jira/browse/HDFS-7926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated HDFS-7926: -- Labels: (was: 2.6.1-candidate) Removing the 2.6.1-candidate label as truncate is not a feature in 2.6. NameNode implementation of ClientProtocol.truncate(..) is not idempotent Key: HDFS-7926 URL: https://issues.apache.org/jira/browse/HDFS-7926 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Fix For: 2.7.0 Attachments: h7926_20150313.patch, h7926_20150313b.patch If the DFS client drops the first response of a truncate RPC call, the retry via the retry cache will fail with "DFSClient ... is already the current lease holder". The truncate RPC is annotated as @Idempotent in ClientProtocol but the NameNode implementation is not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8622) Implement GETCONTENTSUMMARY operation for WebImageViewer
[ https://issues.apache.org/jira/browse/HDFS-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14695390#comment-14695390 ] Hudson commented on HDFS-8622: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2213 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2213/]) HDFS-8622. Implement GETCONTENTSUMMARY operation for WebImageViewer. Contributed by Jagadesh Kiran N. (aajisaka: rev 40f815131e822f5b7a8e6a6827f4b85b31220c43) * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsImageViewer.md * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewerForContentSummary.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageHandler.java Implement GETCONTENTSUMMARY operation for WebImageViewer Key: HDFS-8622 URL: https://issues.apache.org/jira/browse/HDFS-8622 Project: Hadoop HDFS Issue Type: New Feature Reporter: Jagadesh Kiran N Assignee: Jagadesh Kiran N Attachments: HDFS-8622-00.patch, HDFS-8622-01.patch, HDFS-8622-02.patch, HDFS-8622-03.patch, HDFS-8622-04.patch, HDFS-8622-05.patch, HDFS-8622-06.patch, HDFS-8622-07.patch, HDFS-8622-08.patch, HDFS-8622-09.patch, HDFS-8622-10.patch It would be better for administrators if {code} GETCONTENTSUMMARY {code} is supported. -- This message was sent by Atlassian JIRA (v6.3.4#6332)