[jira] [Commented] (HDFS-7948) TestDataNodeHotSwapVolumes#testAddVolumeFailures failed on Windows
[ https://issues.apache.org/jira/browse/HDFS-7948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366812#comment-14366812 ] Hadoop QA commented on HDFS-7948: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705273/HDFS-7948.00.patch against trunk revision 5b322c6. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9952//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9952//console This message is automatically generated. TestDataNodeHotSwapVolumes#testAddVolumeFailures failed on Windows -- Key: HDFS-7948 URL: https://issues.apache.org/jira/browse/HDFS-7948 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7948.00.patch The failure below occurs because File#getCanonicalPath() does not work on Windows with a File object created from a URI-format path (file:/c:/users/xyz/test/data/dfs/data/new_vol1). I will post a fix shortly.
{code}
testAddVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes) Time elapsed: 5.746 sec ERROR!
java.io.IOException: The filename, directory name, or volume label syntax is incorrect
at java.io.WinNTFileSystem.canonicalize0(Native Method)
at java.io.Win32FileSystem.canonicalize(Win32FileSystem.java:414)
at java.io.File.getCanonicalPath(File.java:589)
at org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.testAddVolumeFailures(TestDataNodeHotSwapVolumes.java:525)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
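A minimal sketch of the failure mode above and one way around it (illustrative only, not necessarily what HDFS-7948.00.patch does): letting java.net.URI parse the string before constructing the File gives getCanonicalPath() a plain local path to work with.
{code}
import java.io.File;
import java.io.IOException;
import java.net.URI;

public class CanonicalPathSketch {
  public static void main(String[] args) throws IOException {
    // A URI-format string, as in the failing test.
    String uriString = "file:/c:/users/xyz/test/data/dfs/data/new_vol1";

    // Fails on Windows with java.io.IOException: "file:" is not valid
    // Windows path syntax, so the native canonicalize call rejects it.
    // new File(uriString).getCanonicalPath();

    // Works: parse the string as a URI first, then build the File from it,
    // so getCanonicalPath() sees a plain local path such as c:\users\xyz\...
    File f = new File(URI.create(uriString));
    System.out.println(f.getCanonicalPath());
  }
}
{code}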
[jira] [Updated] (HDFS-7891) A block placement policy with best fault tolerance
[ https://issues.apache.org/jira/browse/HDFS-7891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-7891: Attachment: HDFS-7891.003.patch A block placement policy with best fault tolerance -- Key: HDFS-7891 URL: https://issues.apache.org/jira/browse/HDFS-7891 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-7891.002.patch, HDFS-7891.003.patch, HDFS-7891.patch, PlacementPolicyBenchmark.txt, testresult.txt a block placement policy tries its best to place replicas to most racks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7891) A block placement policy with best fault tolerance
[ https://issues.apache.org/jira/browse/HDFS-7891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366860#comment-14366860 ] Walter Su commented on HDFS-7891: - I uploaded a new patch, {{HDFS-7891.003.patch}}. It uses the random policy with {{maxNodesPerRack}}.
*Design*
R = total racks in the cluster
X = total expected replicas
Q = Math.floor(X/R)
T = X % R
If X < R, then X racks have 1 replica each.
If X >= R and T == 0, then all R racks have Q replicas.
If X >= R and T != 0, then R-T racks have Q replicas and T racks have Q+1 replicas.
*Coding*
1. Add a function getMaxNodesPerRack(..)
2. Call getMaxNodesPerRack(..) before every choosing step
*Defects*
I copied a lot of code from {{BlockPlacementPolicyDefault}}. The alternative would be to modify {{BlockPlacementPolicyDefault}} so it calls an empty getMaxNodesPerRack(..) before every choosing step, which I could then override in a child class. A block placement policy with best fault tolerance -- Key: HDFS-7891 URL: https://issues.apache.org/jira/browse/HDFS-7891 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-7891.002.patch, HDFS-7891.003.patch, HDFS-7891.patch, PlacementPolicyBenchmark.txt, testresult.txt a block placement policy tries its best to place replicas to most racks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
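A hedged sketch of the distribution rule in the comment above (illustrative names, not the patch code):
{code}
public class MaxNodesPerRackSketch {
  static int getMaxNodesPerRack(int totalRacks, int expectedReplicas) {
    final int r = totalRacks;
    final int x = expectedReplicas;
    if (x < r) {
      return 1;                // X racks end up with 1 replica each
    }
    final int q = x / r;       // Q = Math.floor(X / R)
    final int t = x % r;       // T = X % R
    return t == 0 ? q : q + 1; // T == 0: Q everywhere; else Q+1 on T racks
  }

  public static void main(String[] args) {
    System.out.println(getMaxNodesPerRack(13, 14)); // 2 (Q = 1, T = 1)
    System.out.println(getMaxNodesPerRack(3, 6));   // 2 (Q = 2, T = 0)
  }
}
{code}
With X = 14 and R = 13, for example, Q = 1 and T = 1, so the cap is 2 but only one rack should actually hold 2 replicas.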
[jira] [Updated] (HDFS-7261) storageMap is accessed without synchronization in DatanodeDescriptor#updateHeartbeatState()
[ https://issues.apache.org/jira/browse/HDFS-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated HDFS-7261: --- Attachment: HDFS-7261-002.patch storageMap is accessed without synchronization in DatanodeDescriptor#updateHeartbeatState() --- Key: HDFS-7261 URL: https://issues.apache.org/jira/browse/HDFS-7261 Project: Hadoop HDFS Issue Type: Bug Reporter: Ted Yu Assignee: Brahma Reddy Battula Attachments: HDFS-7261-001.patch, HDFS-7261-002.patch, HDFS-7261.patch Here is the code:
{code}
failedStorageInfos = new HashSet<DatanodeStorageInfo>(storageMap.values());
{code}
In other places, the lock on DatanodeDescriptor.storageMap is held:
{code}
synchronized (storageMap) {
  final Collection<DatanodeStorageInfo> storages = storageMap.values();
  return storages.toArray(new DatanodeStorageInfo[storages.size()]);
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
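For illustration, a self-contained sketch of the synchronized copy (the assumed shape of the fix, not necessarily what HDFS-7261-002.patch does; StorageInfo stands in for DatanodeStorageInfo):
{code}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class DatanodeDescriptorSketch {
  // Stand-in for DatanodeStorageInfo.
  static class StorageInfo {}

  private final Map<String, StorageInfo> storageMap =
      new HashMap<String, StorageInfo>();

  // Copy the values while holding the same lock the other accessors use,
  // so the copy cannot race with concurrent modification of the map.
  Set<StorageInfo> snapshotStorages() {
    synchronized (storageMap) {
      return new HashSet<StorageInfo>(storageMap.values());
    }
  }

  public static void main(String[] args) {
    DatanodeDescriptorSketch d = new DatanodeDescriptorSketch();
    d.storageMap.put("DS-1", new StorageInfo());
    System.out.println(d.snapshotStorages().size()); // 1
  }
}
{code}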
[jira] [Updated] (HDFS-7950) Fix TestFsDatasetImpl#testAddVolumes failure on Windows
[ https://issues.apache.org/jira/browse/HDFS-7950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-7950: - Attachment: HDFS-7950.01.patch Found a separate issue with TestFsDatasetImpl#testAddVolumeFailureReleasesInUseLock on Linux/Mac but not Windows. Will open a separate JIRA for it. Fix TestFsDatasetImpl#testAddVolumes failure on Windows --- Key: HDFS-7950 URL: https://issues.apache.org/jira/browse/HDFS-7950 Project: Hadoop HDFS Issue Type: Sub-task Components: test Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7950.00.patch, HDFS-7950.01.patch The test should use Iterables.elementsEqual() instead of JUnit assertEquals to compare two object lists. I will post a patch shortly.
{code}
testAddVolumes(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl) Time elapsed: 0.116 sec FAILURE!
java.lang.AssertionError: expected:[D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target\test\data\Y8839Td1WC\newData2, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target\test\data\Y8839Td1WC\newData1, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target\test\data\Y8839Td1WC\newData0] but was:[D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target/test/data/Y8839Td1WC/newData0, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target/test/data/Y8839Td1WC/newData2, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target/test/data/Y8839Td1WC/newData1]
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:144)
at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testAddVolumes(TestFsDatasetImpl.java:165)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
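For reference, a minimal example of the Guava call mentioned above (assumed test shape, not the actual patch; the real test compares volume paths): Iterables.elementsEqual() walks both iterables and compares corresponding elements, instead of relying on a single List#equals call as JUnit's assertEquals does.
{code}
import com.google.common.collect.Iterables;

import java.util.Arrays;
import java.util.List;

public class ElementsEqualSketch {
  public static void main(String[] args) {
    List<String> expected = Arrays.asList("newData0", "newData1", "newData2");
    List<String> actual   = Arrays.asList("newData0", "newData1", "newData2");
    // true iff both iterables yield equal elements in the same order.
    System.out.println(Iterables.elementsEqual(expected, actual)); // true
  }
}
{code}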
[jira] [Created] (HDFS-7951) Fix NPE for TestFsDatasetImpl#testAddVolumeFailureReleasesInUseLock on Linux
Xiaoyu Yao created HDFS-7951: Summary: Fix NPE for TestFsDatasetImpl#testAddVolumeFailureReleasesInUseLock on Linux Key: HDFS-7951 URL: https://issues.apache.org/jira/browse/HDFS-7951 Project: Hadoop HDFS Issue Type: Sub-task Components: test Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao
{code}
acquired by nodename 17386@HW11217.local
2015-03-18 01:07:06,036 WARN common.Util (Util.java:stringAsURI(56)) - Path target/test/data/iFEYz7UKAU/bad should be specified as a URI in configuration files. Please update hdfs configuration.
java.lang.NullPointerException
at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.addVolume(FsDatasetImpl.java:403)
at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testAddVolumeFailureReleasesInUseLock(TestFsDatasetImpl.java:344)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7950) Fix TestFsDatasetImpl#testAddVolumes failure on Windows
[ https://issues.apache.org/jira/browse/HDFS-7950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366862#comment-14366862 ] Hadoop QA commented on HDFS-7950: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705277/HDFS-7950.00.patch against trunk revision 5b322c6. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9953//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9953//console This message is automatically generated. Fix TestFsDatasetImpl#testAddVolumes failure on Windows --- Key: HDFS-7950 URL: https://issues.apache.org/jira/browse/HDFS-7950 Project: Hadoop HDFS Issue Type: Sub-task Components: test Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7950.00.patch, HDFS-7950.01.patch, HDFS-7950.02.patch The test should use Iterables.elementsEqual() instead of Junit AssertEquals to compare two object list. I will post a patch shortly. {code} testAddVolumes(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl) Time elapsed: 0.116 sec FAILURE! java.lang.AssertionError: expected:[D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target\test\data\Y8839Td1WC\newData2, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target\test\data\Y8839Td1WC\newData1, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target\test\data\Y8839Td1WC\newData0] but was:[D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target/test/data/Y8839Td1WC/newData0, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target/test/data/Y8839Td1WC/newData2, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target/test/data/Y8839Td1WC/newData1] at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testAddVolumes(TestFsDatasetImpl.java:165) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7891) A block placement policy with best fault tolerance
[ https://issues.apache.org/jira/browse/HDFS-7891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366868#comment-14366868 ] Walter Su commented on HDFS-7891: - 4th. Another reason I prefer the random policy with {{maxNodesPerRack}} is that I don't have to worry about balance. The random algorithm chooses a node from all nodes; it doesn't choose a rack from all racks. If the distribution of nodes across racks is skewed, the policy will still keep space balanced. A block placement policy with best fault tolerance -- Key: HDFS-7891 URL: https://issues.apache.org/jira/browse/HDFS-7891 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-7891.002.patch, HDFS-7891.003.patch, HDFS-7891.patch, PlacementPolicyBenchmark.txt, testresult.txt a block placement policy tries its best to place replicas to most racks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7891) A block placement policy with best fault tolerance
[ https://issues.apache.org/jira/browse/HDFS-7891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366903#comment-14366903 ] Walter Su commented on HDFS-7891: - The CPU time of the random policy with maxNodesPerRack is less than 1/10 msec per call. Is it necessary to use the sorted-rack method? Maybe not, IMO. A block placement policy with best fault tolerance -- Key: HDFS-7891 URL: https://issues.apache.org/jira/browse/HDFS-7891 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-7891.002.patch, HDFS-7891.003.patch, HDFS-7891.patch, PlacementPolicyBenchmark.txt, testresult.txt a block placement policy tries its best to place replicas to most racks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7950) Fix TestFsDatasetImpl#testAddVolumes failure on Windows
[ https://issues.apache.org/jira/browse/HDFS-7950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-7950: - Attachment: HDFS-7950.02.patch Remove the changes for TestFsDatasetImpl#testAddVolumeFailureReleasesInUseLock that is included by accident. Fix TestFsDatasetImpl#testAddVolumes failure on Windows --- Key: HDFS-7950 URL: https://issues.apache.org/jira/browse/HDFS-7950 Project: Hadoop HDFS Issue Type: Sub-task Components: test Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7950.00.patch, HDFS-7950.01.patch, HDFS-7950.02.patch The test should use Iterables.elementsEqual() instead of Junit AssertEquals to compare two object list. I will post a patch shortly. {code} testAddVolumes(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl) Time elapsed: 0.116 sec FAILURE! java.lang.AssertionError: expected:[D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target\test\data\Y8839Td1WC\newData2, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target\test\data\Y8839Td1WC\newData1, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target\test\data\Y8839Td1WC\newData0] but was:[D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target/test/data/Y8839Td1WC/newData0, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target/test/data/Y8839Td1WC/newData2, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target/test/data/Y8839Td1WC/newData1] at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testAddVolumes(TestFsDatasetImpl.java:165) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7950) Fix TestFsDatasetImpl#testAddVolumes failure on Windows
[ https://issues.apache.org/jira/browse/HDFS-7950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-7950: - Status: Patch Available (was: Open) Fix TestFsDatasetImpl#testAddVolumes failure on Windows --- Key: HDFS-7950 URL: https://issues.apache.org/jira/browse/HDFS-7950 Project: Hadoop HDFS Issue Type: Sub-task Components: test Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7950.00.patch, HDFS-7950.01.patch, HDFS-7950.02.patch The test should use Iterables.elementsEqual() instead of Junit AssertEquals to compare two object list. I will post a patch shortly. {code} testAddVolumes(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl) Time elapsed: 0.116 sec FAILURE! java.lang.AssertionError: expected:[D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target\test\data\Y8839Td1WC\newData2, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target\test\data\Y8839Td1WC\newData1, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target\test\data\Y8839Td1WC\newData0] but was:[D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target/test/data/Y8839Td1WC/newData0, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target/test/data/Y8839Td1WC/newData2, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target/test/data/Y8839Td1WC/newData1] at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testAddVolumes(TestFsDatasetImpl.java:165) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7891) A block placement policy with best fault tolerance
[ https://issues.apache.org/jira/browse/HDFS-7891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-7891: Attachment: (was: HDFS-7891.003.patch) A block placement policy with best fault tolerance -- Key: HDFS-7891 URL: https://issues.apache.org/jira/browse/HDFS-7891 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-7891.002.patch, HDFS-7891.patch, PlacementPolicyBenchmark.txt, testresult.txt a block placement policy tries its best to place replicas to most racks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7951) Fix NPE for TestFsDatasetImpl#testAddVolumeFailureReleasesInUseLock on Linux
[ https://issues.apache.org/jira/browse/HDFS-7951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-7951: - Attachment: HDFS-7951.00.patch On non-Windows OSes, the 2nd mockito parameter of prepareVolume does not match the badDir in non-absolute format, resulting in a null volume builder being used during addVolume(). Fix NPE for TestFsDatasetImpl#testAddVolumeFailureReleasesInUseLock on Linux Key: HDFS-7951 URL: https://issues.apache.org/jira/browse/HDFS-7951 Project: Hadoop HDFS Issue Type: Sub-task Components: test Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7951.00.patch
{code}
acquired by nodename 17386@HW11217.local
2015-03-18 01:07:06,036 WARN common.Util (Util.java:stringAsURI(56)) - Path target/test/data/iFEYz7UKAU/bad should be specified as a URI in configuration files. Please update hdfs configuration.
java.lang.NullPointerException
at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.addVolume(FsDatasetImpl.java:403)
at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testAddVolumeFailureReleasesInUseLock(TestFsDatasetImpl.java:344)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
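A hedged illustration of that kind of Mockito mismatch (hypothetical interface and names, not the HDFS test code): a stub keyed on one File instance only answers for an equal File, so a call with a differently formatted path falls through to the default null.
{code}
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import java.io.File;

public class MatcherSketch {
  // Hypothetical interface standing in for the mocked helper.
  interface VolumePreparer {
    Object prepareVolume(String bpid, File dir);
  }

  public static void main(String[] args) {
    VolumePreparer vp = mock(VolumePreparer.class);
    File absolute = new File("bad").getAbsoluteFile();

    // Stubbed with the absolute form only...
    when(vp.prepareVolume("bp-1", absolute)).thenReturn(new Object());

    System.out.println(vp.prepareVolume("bp-1", absolute));        // stubbed
    // ...so the non-absolute form does not match and returns null.
    System.out.println(vp.prepareVolume("bp-1", new File("bad"))); // null
  }
}
{code}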
[jira] [Updated] (HDFS-7951) Fix NPE for TestFsDatasetImpl#testAddVolumeFailureReleasesInUseLock on Linux
[ https://issues.apache.org/jira/browse/HDFS-7951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-7951: - Status: Patch Available (was: Open) Fix NPE for TestFsDatasetImpl#testAddVolumeFailureReleasesInUseLock on Linux Key: HDFS-7951 URL: https://issues.apache.org/jira/browse/HDFS-7951 Project: Hadoop HDFS Issue Type: Sub-task Components: test Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7951.00.patch {code} acquired by nodename 17386@HW11217.local 2015-03-18 01:07:06,036 WARN common.Util (Util.java:stringAsURI(56)) - Path target/test/data/iFEYz7UKAU/bad should be specified as a URI in configuration files. Please update hdfs configuration. java.lang.NullPointerException at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.addVolume(FsDatasetImpl.java:403) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testAddVolumeFailureReleasesInUseLock(TestFsDatasetImpl.java:344) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7891) A block placement policy with best fault tolerance
[ https://issues.apache.org/jira/browse/HDFS-7891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-7891: Attachment: HDFS-7891.003.patch A block placement policy with best fault tolerance -- Key: HDFS-7891 URL: https://issues.apache.org/jira/browse/HDFS-7891 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-7891.002.patch, HDFS-7891.003.patch, HDFS-7891.patch, PlacementPolicyBenchmark.txt, testresult.txt a block placement policy tries its best to place replicas to most racks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7891) A block placement policy with best fault tolerance
[ https://issues.apache.org/jira/browse/HDFS-7891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-7891: Attachment: (was: testresult.txt) A block placement policy with best fault tolerance -- Key: HDFS-7891 URL: https://issues.apache.org/jira/browse/HDFS-7891 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-7891.002.patch, HDFS-7891.003.patch, HDFS-7891.patch, PlacementPolicyBenchmark.txt a block placement policy tries its best to place replicas to most racks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7891) A block placement policy with best fault tolerance
[ https://issues.apache.org/jira/browse/HDFS-7891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-7891: Attachment: testresult.txt A block placement policy with best fault tolerance -- Key: HDFS-7891 URL: https://issues.apache.org/jira/browse/HDFS-7891 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-7891.002.patch, HDFS-7891.003.patch, HDFS-7891.patch, PlacementPolicyBenchmark.txt, testresult.txt a block placement policy tries its best to place replicas to most racks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7617) Add editlog transactions for EC
[ https://issues.apache.org/jira/browse/HDFS-7617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366788#comment-14366788 ] Hui Zheng commented on HDFS-7617: - Hi Jing, are you going to move the dataBlockNum and parityBlockNum from BlockInfoStriped to the XAttr of the INode? If we can get that information from the XAttr, we can save a lot of memory compared to storing it per BlockInfoStriped. Add editlog transactions for EC --- Key: HDFS-7617 URL: https://issues.apache.org/jira/browse/HDFS-7617 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Hui Zheng -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HDFS-7617) Add editlog transactions for EC
[ https://issues.apache.org/jira/browse/HDFS-7617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-7617 started by Hui Zheng. --- Add editlog transactions for EC --- Key: HDFS-7617 URL: https://issues.apache.org/jira/browse/HDFS-7617 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Hui Zheng -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7891) A block placement policy with best fault tolerance
[ https://issues.apache.org/jira/browse/HDFS-7891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366900#comment-14366900 ] Walter Su commented on HDFS-7891: - The previous testresult.txt is wrong. I simply let maxNodesPerRack = Math.ceil(X/R). If choosing 14 targets from 13 racks, maxNodesPerRack will be 2, and the number of invocations of random is much smaller. It looks fast, but it's wrong. I tested again with {{HDFS-7891.003.patch}}, and the result is correct, but not good. So I'm re-considering the sorted-rack method. A block placement policy with best fault tolerance -- Key: HDFS-7891 URL: https://issues.apache.org/jira/browse/HDFS-7891 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Walter Su Attachments: HDFS-7891.002.patch, HDFS-7891.003.patch, HDFS-7891.patch, PlacementPolicyBenchmark.txt, testresult.txt a block placement policy tries its best to place replicas to most racks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7816) Unable to open webhdfs paths with +
[ https://issues.apache.org/jira/browse/HDFS-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366890#comment-14366890 ] Hadoop QA commented on HDFS-7816: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12704261/HDFS-7816.002.patch against trunk revision 5b322c6. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestHarFileSystemWithHA The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestCrcCorruption Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9954//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9954//console This message is automatically generated. Unable to open webhdfs paths with + - Key: HDFS-7816 URL: https://issues.apache.org/jira/browse/HDFS-7816 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.7.0 Reporter: Jason Lowe Assignee: Haohui Mai Priority: Blocker Attachments: HDFS-7816.002.patch, HDFS-7816.patch, HDFS-7816.patch webhdfs requests to open files with % characters in the filename fail because the filename is not being decoded properly. For example:
{noformat}
$ hadoop fs -cat 'webhdfs://nn/user/somebody/abc%def'
cat: File does not exist: /user/somebody/abc%25def
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
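For context, a small Java illustration of the encoding semantics involved (illustrative only, not the patch): a literal % in a path must travel as %25, decoding is not idempotent, and form-style decoding also rewrites + as a space, which is the symptom in this issue's title.
{code}
import java.net.URLDecoder;
import java.net.URLEncoder;

public class WebHdfsEncodingSketch {
  public static void main(String[] args) throws Exception {
    // A literal "%" must be percent-encoded on the wire...
    System.out.println(URLEncoder.encode("abc%def", "UTF-8"));    // abc%25def
    // ...and decoding it once recovers the original name; decoding twice
    // (or never) yields a name that no longer matches what is stored.
    System.out.println(URLDecoder.decode("abc%25def", "UTF-8"));  // abc%def
    // Form-style decoding also rewrites "+" as a space:
    System.out.println(URLDecoder.decode("a+b", "UTF-8"));        // a b
  }
}
{code}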
[jira] [Commented] (HDFS-7946) TestDataNodeVolumeFailureReporting NPE on Windows
[ https://issues.apache.org/jira/browse/HDFS-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366992#comment-14366992 ] Hudson commented on HDFS-7946: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #136 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/136/]) HDFS-7946. TestDataNodeVolumeFailureReporting NPE on Windows. (Contributed by Xiaoyu Yao) (arp: rev 5b322c6a823208bbc64698379340343a72e8160a) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailureReporting.java TestDataNodeVolumeFailureReporting NPE on Windows - Key: HDFS-7946 URL: https://issues.apache.org/jira/browse/HDFS-7946 Project: Hadoop HDFS Issue Type: Sub-task Components: test Environment: Windows Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.7.0 Attachments: HDFS-7946.00.patch TestDataNodeVolumeFailureReporting has a pre-test setUp that assumeTrue(!Path.WINDOWS) but the post-test tearDown() does not. This triggers NPE when closing cluster. testSuccessiveVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting) Time elapsed: 0.267 sec ERROR! org.junit.internal.AssumptionViolatedException: got: false, expected: is true at org.junit.Assume.assumeThat(Assume.java:95) at org.junit.Assume.assumeTrue(Assume.java:41) at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting.setUp(TestDataNodeVolumeFailureReporting.java:83) testSuccessiveVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting) Time elapsed: 0.267 sec ERROR! java.lang.NullPointerException: null at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting.tearDown(TestDataNodeVolumeFailureReporting.java:103) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7940) Add tracing to DFSClient#setQuotaByStorageType
[ https://issues.apache.org/jira/browse/HDFS-7940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367002#comment-14367002 ] Hudson commented on HDFS-7940: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #136 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/136/]) HDFS-7940. Add tracing to DFSClient#setQuotaByStorageType (Rakesh R via Colin P. McCabe) (cmccabe: rev d8846707c58c5c3ec542128df13a82ddc05fb347) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Add tracing to DFSClient#setQuotaByStorageType -- Key: HDFS-7940 URL: https://issues.apache.org/jira/browse/HDFS-7940 Project: Hadoop HDFS Issue Type: Sub-task Components: dfsclient Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.7.0 Attachments: HDFS-7940-01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7951) Fix NPE for TestFsDatasetImpl#testAddVolumeFailureReleasesInUseLock on Linux
[ https://issues.apache.org/jira/browse/HDFS-7951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367029#comment-14367029 ] Hadoop QA commented on HDFS-7951: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705300/HDFS-7951.00.patch against trunk revision 3411732. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9956//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9956//console This message is automatically generated. Fix NPE for TestFsDatasetImpl#testAddVolumeFailureReleasesInUseLock on Linux Key: HDFS-7951 URL: https://issues.apache.org/jira/browse/HDFS-7951 Project: Hadoop HDFS Issue Type: Sub-task Components: test Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7951.00.patch {code} acquired by nodename 17386@HW11217.local 2015-03-18 01:07:06,036 WARN common.Util (Util.java:stringAsURI(56)) - Path target/test/data/iFEYz7UKAU/bad should be specified as a URI in configuration files. Please update hdfs configuration. java.lang.NullPointerException at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.addVolume(FsDatasetImpl.java:403) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testAddVolumeFailureReleasesInUseLock(TestFsDatasetImpl.java:344) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7934) During Rolling upgrade rollback ,standby namenode startup fails.
[ https://issues.apache.org/jira/browse/HDFS-7934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] J.Andreina updated HDFS-7934: - Attachment: HDFS-7934.1.patch Hi Vinayakumar B, thanks for your comments. I uploaded an initial patch as per your suggestion and verified locally (Standby Namenode startup is successful). Please review the patch. During Rolling upgrade rollback ,standby namenode startup fails. Key: HDFS-7934 URL: https://issues.apache.org/jira/browse/HDFS-7934 Project: Hadoop HDFS Issue Type: Bug Reporter: J.Andreina Assignee: J.Andreina Priority: Critical Attachments: HDFS-7934.1.patch During rolling upgrade rollback, standby namenode startup fails while loading edits when there is no local copy of the edits created after the upgrade (which has already been removed by the Active Namenode from the journal manager and from the Active's local storage). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7940) Add tracing to DFSClient#setQuotaByStorageType
[ https://issues.apache.org/jira/browse/HDFS-7940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367083#comment-14367083 ] Hudson commented on HDFS-7940: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2068 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2068/]) HDFS-7940. Add tracing to DFSClient#setQuotaByStorageType (Rakesh R via Colin P. McCabe) (cmccabe: rev d8846707c58c5c3ec542128df13a82ddc05fb347) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Add tracing to DFSClient#setQuotaByStorageType -- Key: HDFS-7940 URL: https://issues.apache.org/jira/browse/HDFS-7940 Project: Hadoop HDFS Issue Type: Sub-task Components: dfsclient Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.7.0 Attachments: HDFS-7940-01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7261) storageMap is accessed without synchronization in DatanodeDescriptor#updateHeartbeatState()
[ https://issues.apache.org/jira/browse/HDFS-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367115#comment-14367115 ] Hadoop QA commented on HDFS-7261: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705313/HDFS-7261-002.patch against trunk revision 3411732. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9957//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9957//console This message is automatically generated. storageMap is accessed without synchronization in DatanodeDescriptor#updateHeartbeatState() --- Key: HDFS-7261 URL: https://issues.apache.org/jira/browse/HDFS-7261 Project: Hadoop HDFS Issue Type: Bug Reporter: Ted Yu Assignee: Brahma Reddy Battula Attachments: HDFS-7261-001.patch, HDFS-7261-002.patch, HDFS-7261.patch Here is the code:
{code}
failedStorageInfos = new HashSet<DatanodeStorageInfo>(storageMap.values());
{code}
In other places, the lock on DatanodeDescriptor.storageMap is held:
{code}
synchronized (storageMap) {
  final Collection<DatanodeStorageInfo> storages = storageMap.values();
  return storages.toArray(new DatanodeStorageInfo[storages.size()]);
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7940) Add tracing to DFSClient#setQuotaByStorageType
[ https://issues.apache.org/jira/browse/HDFS-7940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367014#comment-14367014 ] Hudson commented on HDFS-7940: -- FAILURE: Integrated in Hadoop-Yarn-trunk #870 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/870/]) HDFS-7940. Add tracing to DFSClient#setQuotaByStorageType (Rakesh R via Colin P. McCabe) (cmccabe: rev d8846707c58c5c3ec542128df13a82ddc05fb347) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Add tracing to DFSClient#setQuotaByStorageType -- Key: HDFS-7940 URL: https://issues.apache.org/jira/browse/HDFS-7940 Project: Hadoop HDFS Issue Type: Sub-task Components: dfsclient Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.7.0 Attachments: HDFS-7940-01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7950) Fix TestFsDatasetImpl#testAddVolumes failure on Windows
[ https://issues.apache.org/jira/browse/HDFS-7950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367022#comment-14367022 ] Hadoop QA commented on HDFS-7950: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705296/HDFS-7950.02.patch against trunk revision 3411732. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestCrcCorruption org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9955//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9955//console This message is automatically generated. Fix TestFsDatasetImpl#testAddVolumes failure on Windows --- Key: HDFS-7950 URL: https://issues.apache.org/jira/browse/HDFS-7950 Project: Hadoop HDFS Issue Type: Sub-task Components: test Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7950.00.patch, HDFS-7950.01.patch, HDFS-7950.02.patch The test should use Iterables.elementsEqual() instead of Junit AssertEquals to compare two object list. I will post a patch shortly. {code} testAddVolumes(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl) Time elapsed: 0.116 sec FAILURE! java.lang.AssertionError: expected:[D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target\test\data\Y8839Td1WC\newData2, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target\test\data\Y8839Td1WC\newData1, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target\test\data\Y8839Td1WC\newData0] but was:[D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target/test/data/Y8839Td1WC/newData0, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target/test/data/Y8839Td1WC/newData2, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target/test/data/Y8839Td1WC/newData1] at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testAddVolumes(TestFsDatasetImpl.java:165) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7934) During Rolling upgrade rollback ,standby namenode startup fails.
[ https://issues.apache.org/jira/browse/HDFS-7934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] J.Andreina updated HDFS-7934: - Status: Patch Available (was: Open) During Rolling upgrade rollback ,standby namenode startup fails. Key: HDFS-7934 URL: https://issues.apache.org/jira/browse/HDFS-7934 Project: Hadoop HDFS Issue Type: Bug Reporter: J.Andreina Assignee: J.Andreina Priority: Critical Attachments: HDFS-7934.1.patch During Rolling upgrade rollback , standby namenode startup fails , while loading edits and when there is no local copy of edits created after upgrade ( which is already been removed by Active Namenode from journal manager and from Active's local). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7946) TestDataNodeVolumeFailureReporting NPE on Windows
[ https://issues.apache.org/jira/browse/HDFS-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367004#comment-14367004 ] Hudson commented on HDFS-7946: -- FAILURE: Integrated in Hadoop-Yarn-trunk #870 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/870/]) HDFS-7946. TestDataNodeVolumeFailureReporting NPE on Windows. (Contributed by Xiaoyu Yao) (arp: rev 5b322c6a823208bbc64698379340343a72e8160a) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailureReporting.java TestDataNodeVolumeFailureReporting NPE on Windows - Key: HDFS-7946 URL: https://issues.apache.org/jira/browse/HDFS-7946 Project: Hadoop HDFS Issue Type: Sub-task Components: test Environment: Windows Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.7.0 Attachments: HDFS-7946.00.patch TestDataNodeVolumeFailureReporting has a pre-test setUp that assumeTrue(!Path.WINDOWS) but the post-test tearDown() does not. This triggers NPE when closing cluster. testSuccessiveVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting) Time elapsed: 0.267 sec ERROR! org.junit.internal.AssumptionViolatedException: got: false, expected: is true at org.junit.Assume.assumeThat(Assume.java:95) at org.junit.Assume.assumeTrue(Assume.java:41) at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting.setUp(TestDataNodeVolumeFailureReporting.java:83) testSuccessiveVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting) Time elapsed: 0.267 sec ERROR! java.lang.NullPointerException: null at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting.tearDown(TestDataNodeVolumeFailureReporting.java:103) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7952) On starting Standby with rollback option, lastPromisedEpoch gets updated and Active Namenode is shutting down.
J.Andreina created HDFS-7952: Summary: On starting Standby with rollback option, lastPromisedEpoch gets updated and Active Namenode is shutting down. Key: HDFS-7952 URL: https://issues.apache.org/jira/browse/HDFS-7952 Project: Hadoop HDFS Issue Type: Bug Reporter: J.Andreina Assignee: J.Andreina Priority: Critical
Step 1: Start NN1 as active, NN2 as standby.
Step 2: Perform hdfs dfsadmin -rollingUpgrade prepare.
Step 3: Start NN2 as active and NN1 as standby with the rolling upgrade started option.
Step 4: Restart the DN in upgrade mode and write files to HDFS.
Step 5: Stop both Namenodes and the DN.
Step 6: Restart NN2 as active and NN1 as standby with the rolling upgrade rollback option.
Issue:
On restarting NN1 as standby with the rollback option, lastPromisedEpoch gets updated and the active NN2 shuts down with the following exception.
{noformat}
15/03/18 16:25:56 FATAL namenode.FSEditLog: Error: flush failed for required journal (JournalAndStream(mgr=QJM to [XXX:8485, YYY:8485], stream=QuorumOutputStream starting at txid 22))
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/2. 2 exceptions thrown:
XXX:8485: IPC's epoch 5 is less than the last promised epoch 6
at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:418)
at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:446)
at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:341)
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
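The quoted error comes from the epoch check in Journal#checkRequest; a hedged, simplified rendering of that check (not the real method body) shows why the old active is fenced off once the restart bumps lastPromisedEpoch:
{code}
import java.io.IOException;

public class EpochCheckSketch {
  // Bumped to 6 when NN1 restarts with the rollback option.
  private long lastPromisedEpoch = 6;

  // A writer presenting a lower epoch (NN2, still at 5) is rejected.
  void checkRequest(long ipcEpoch) throws IOException {
    if (ipcEpoch < lastPromisedEpoch) {
      throw new IOException("IPC's epoch " + ipcEpoch
          + " is less than the last promised epoch " + lastPromisedEpoch);
    }
  }

  public static void main(String[] args) throws IOException {
    new EpochCheckSketch().checkRequest(5); // throws: epoch 5 < 6
  }
}
{code}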
[jira] [Commented] (HDFS-7953) Browse Directory can't work if path contains # in Namenode web UI
[ https://issues.apache.org/jira/browse/HDFS-7953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367196#comment-14367196 ] kanaka kumar avvaru commented on HDFS-7953: --- In the {{explorer.js}} file:
{code}
browse_directory(dir) {
  ...
  var url = '/webhdfs/v1' + dir + '?op=LISTSTATUS';
  ...
}
{code}
The directory path in the URL is not encoded, which causes the error on the server side. I am submitting a patch that encodes the path with the same function used to fix {{HDFS-6662}}. Browse Directory can't work if path contains # in Namenode web UI - Key: HDFS-7953 URL: https://issues.apache.org/jira/browse/HDFS-7953 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: kanaka kumar avvaru Assignee: kanaka kumar avvaru Priority: Minor Attachments: hdfs-7953.01.patch 1. Create a directory with name containing '#' like test#folder 2. Add a file under this folder 3. Use NameNode UI to browse the folder It gives the following error in web UI Failed to retrieve data from /webhdfs/v1/test#folder?op=LISTSTATUS: Bad Request -- This message was sent by Atlassian JIRA (v6.3.4#6332)
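A small Java illustration of why the unencoded # breaks the request (illustrative, not the patch; host and port are placeholders): # starts the URL fragment, so the rest of the path, including the op parameter, never reaches the server.
{code}
import java.net.URI;

public class FragmentSketch {
  public static void main(String[] args) {
    URI u = URI.create("http://nn:50070/webhdfs/v1/test#folder?op=LISTSTATUS");
    System.out.println(u.getPath());     // /webhdfs/v1/test
    System.out.println(u.getQuery());    // null -- op never arrives
    System.out.println(u.getFragment()); // folder?op=LISTSTATUS
  }
}
{code}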
[jira] [Updated] (HDFS-7953) Browse Directory can't work if path contains # in Namenode web UI
[ https://issues.apache.org/jira/browse/HDFS-7953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kanaka kumar avvaru updated HDFS-7953: -- Status: Patch Available (was: Open) Browse Directory can't work if path contains # in Namenode web UI - Key: HDFS-7953 URL: https://issues.apache.org/jira/browse/HDFS-7953 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: kanaka kumar avvaru Assignee: kanaka kumar avvaru Priority: Minor Attachments: hdfs-7953.01.patch 1. Create a directory with name containing '#' like test#folder 2. Add a file under this folder 3. Use NameNode UI to browse the folder It gives the following error in web UI Failed to retrieve data from /webhdfs/v1/test#folder?op=LISTSTATUS: Bad Request -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6658) Namenode memory optimization - Block replicas list
[ https://issues.apache.org/jira/browse/HDFS-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367179#comment-14367179 ] Daryn Sharp commented on HDFS-6658: --- Would anyone comment on the approach? We are eager to begin stress testing if the approach is reasonable. Namenode memory optimization - Block replicas list --- Key: HDFS-6658 URL: https://issues.apache.org/jira/browse/HDFS-6658 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.4.1 Reporter: Amir Langer Assignee: Daryn Sharp Attachments: BlockListOptimizationComparison.xlsx, BlocksMap redesign.pdf, HDFS-6658.patch, HDFS-6658.patch, HDFS-6658.patch, Namenode Memory Optimizations - Block replicas list.docx, New primative indexes.jpg, Old triplets.jpg Part of the memory consumed by every BlockInfo object in the Namenode is a linked list of block references for every DatanodeStorageInfo (called triplets). We propose to change the way we store the list in memory. Using primitive integer indexes instead of object references will reduce the memory needed for every block replica (when compressed oops is disabled) and in our new design the list overhead will be per DatanodeStorageInfo and not per block replica. see attached design doc. for details and evaluation results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
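As a rough illustration of the direction described in this issue (invented names, far simpler than the attached design doc): replicas per storage can be chained through one primitive int array instead of three object references per block.
{code}
import java.util.Arrays;

public class IntReplicaList {
  private int[] next = new int[8]; // next[i] = index of the replica after i
  private int head = -1;           // index of the first replica, -1 if none
  private int size = 0;

  // Link replica index i at the head of this storage's list. The overhead
  // is one int slot per replica in a shared array, not three references
  // ("triplets") inside every BlockInfo.
  void add(int i) {
    if (i >= next.length) {
      next = Arrays.copyOf(next, Math.max(i + 1, next.length * 2));
    }
    next[i] = head;
    head = i;
    size++;
  }

  int size() {
    return size;
  }

  public static void main(String[] args) {
    IntReplicaList list = new IntReplicaList();
    list.add(3);
    list.add(7);
    System.out.println(list.size()); // 2
  }
}
{code}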
[jira] [Created] (HDFS-7953) Browse Directory can't work if path contains # in Namenode web UI
kanaka kumar avvaru created HDFS-7953: - Summary: Browse Directory can't work if path contains # in Namenode web UI Key: HDFS-7953 URL: https://issues.apache.org/jira/browse/HDFS-7953 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: kanaka kumar avvaru Assignee: kanaka kumar avvaru Priority: Minor 1. Create a directory with name containing '#' like test#folder 2. Add a file under this folder 3. Use NameNode UI to browse the folder It gives the following error in web UI Failed to retrieve data from /webhdfs/v1/test#folder?op=LISTSTATUS: Bad Request -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7953) Browse Directory can't work if path contains # in Namenode web UI
[ https://issues.apache.org/jira/browse/HDFS-7953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kanaka kumar avvaru updated HDFS-7953: -- Attachment: hdfs-7953.01.patch Browse Directory can't work if path contains # in Namenode web UI - Key: HDFS-7953 URL: https://issues.apache.org/jira/browse/HDFS-7953 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: kanaka kumar avvaru Assignee: kanaka kumar avvaru Priority: Minor Attachments: hdfs-7953.01.patch 1. Create a directory with name containing '#' like test#folder 2. Add a file under this folder 3. Use NameNode UI to browse the folder It gives the following error in web UI Failed to retrieve data from /webhdfs/v1/test#folder?op=LISTSTATUS: Bad Request -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7940) Add tracing to DFSClient#setQuotaByStorageType
[ https://issues.apache.org/jira/browse/HDFS-7940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367248#comment-14367248 ] Hudson commented on HDFS-7940: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #127 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/127/]) HDFS-7940. Add tracing to DFSClient#setQuotaByStorageType (Rakesh R via Colin P. McCabe) (cmccabe: rev d8846707c58c5c3ec542128df13a82ddc05fb347) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java Add tracing to DFSClient#setQuotaByStorageType -- Key: HDFS-7940 URL: https://issues.apache.org/jira/browse/HDFS-7940 Project: Hadoop HDFS Issue Type: Sub-task Components: dfsclient Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.7.0 Attachments: HDFS-7940-01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7946) TestDataNodeVolumeFailureReporting NPE on Windows
[ https://issues.apache.org/jira/browse/HDFS-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367239#comment-14367239 ] Hudson commented on HDFS-7946: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #127 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/127/]) HDFS-7946. TestDataNodeVolumeFailureReporting NPE on Windows. (Contributed by Xiaoyu Yao) (arp: rev 5b322c6a823208bbc64698379340343a72e8160a) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailureReporting.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt TestDataNodeVolumeFailureReporting NPE on Windows - Key: HDFS-7946 URL: https://issues.apache.org/jira/browse/HDFS-7946 Project: Hadoop HDFS Issue Type: Sub-task Components: test Environment: Windows Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.7.0 Attachments: HDFS-7946.00.patch TestDataNodeVolumeFailureReporting has a pre-test setUp that assumeTrue(!Path.WINDOWS) but the post-test tearDown() does not. This triggers NPE when closing cluster. testSuccessiveVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting) Time elapsed: 0.267 sec ERROR! org.junit.internal.AssumptionViolatedException: got: false, expected: is true at org.junit.Assume.assumeThat(Assume.java:95) at org.junit.Assume.assumeTrue(Assume.java:41) at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting.setUp(TestDataNodeVolumeFailureReporting.java:83) testSuccessiveVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting) Time elapsed: 0.267 sec ERROR! java.lang.NullPointerException: null at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting.tearDown(TestDataNodeVolumeFailureReporting.java:103) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7946) TestDataNodeVolumeFailureReporting NPE on Windows
[ https://issues.apache.org/jira/browse/HDFS-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367298#comment-14367298 ] Hudson commented on HDFS-7946: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2086 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2086/]) HDFS-7946. TestDataNodeVolumeFailureReporting NPE on Windows. (Contributed by Xiaoyu Yao) (arp: rev 5b322c6a823208bbc64698379340343a72e8160a) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailureReporting.java TestDataNodeVolumeFailureReporting NPE on Windows - Key: HDFS-7946 URL: https://issues.apache.org/jira/browse/HDFS-7946 Project: Hadoop HDFS Issue Type: Sub-task Components: test Environment: Windows Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.7.0 Attachments: HDFS-7946.00.patch TestDataNodeVolumeFailureReporting has a pre-test setUp that assumeTrue(!Path.WINDOWS) but the post-test tearDown() does not. This triggers NPE when closing cluster. testSuccessiveVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting) Time elapsed: 0.267 sec ERROR! org.junit.internal.AssumptionViolatedException: got: false, expected: is true at org.junit.Assume.assumeThat(Assume.java:95) at org.junit.Assume.assumeTrue(Assume.java:41) at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting.setUp(TestDataNodeVolumeFailureReporting.java:83) testSuccessiveVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting) Time elapsed: 0.267 sec ERROR! java.lang.NullPointerException: null at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting.tearDown(TestDataNodeVolumeFailureReporting.java:103) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7928) Scanning blocks from disk during rolling upgrade startup takes a lot of time if disks are busy
[ https://issues.apache.org/jira/browse/HDFS-7928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-7928: - Status: Patch Available (was: Open) Scanning blocks from disk during rolling upgrade startup takes a lot of time if disks are busy -- Key: HDFS-7928 URL: https://issues.apache.org/jira/browse/HDFS-7928 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Rushabh S Shah Assignee: Rushabh S Shah Attachments: HDFS-7928-v1.patch, HDFS-7928.patch We observed this issue in rolling upgrade to 2.6.x on one of our cluster. One of the disks was very busy and it took long time to scan that disk compared to other disks. Seeing the sar (System Activity Reporter) data we saw that the particular disk was very busy performing IO operations. Requesting for an improvement during datanode rolling upgrade. During shutdown, we can persist the whole volume map on the disk and let the datanode read that file and create the volume map during startup after rolling upgrade. This will not require the datanode process to scan all the disk and read the block. This will significantly improve the datanode startup time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7940) Add tracing to DFSClient#setQuotaByStorageType
[ https://issues.apache.org/jira/browse/HDFS-7940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367323#comment-14367323 ] Hudson commented on HDFS-7940: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #136 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/136/]) HDFS-7940. Add tracing to DFSClient#setQuotaByStorageType (Rakesh R via Colin P. McCabe) (cmccabe: rev d8846707c58c5c3ec542128df13a82ddc05fb347) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Add tracing to DFSClient#setQuotaByStorageType -- Key: HDFS-7940 URL: https://issues.apache.org/jira/browse/HDFS-7940 Project: Hadoop HDFS Issue Type: Sub-task Components: dfsclient Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.7.0 Attachments: HDFS-7940-01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7946) TestDataNodeVolumeFailureReporting NPE on Windows
[ https://issues.apache.org/jira/browse/HDFS-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367314#comment-14367314 ] Hudson commented on HDFS-7946: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #136 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/136/]) HDFS-7946. TestDataNodeVolumeFailureReporting NPE on Windows. (Contributed by Xiaoyu Yao) (arp: rev 5b322c6a823208bbc64698379340343a72e8160a) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailureReporting.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt TestDataNodeVolumeFailureReporting NPE on Windows - Key: HDFS-7946 URL: https://issues.apache.org/jira/browse/HDFS-7946 Project: Hadoop HDFS Issue Type: Sub-task Components: test Environment: Windows Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.7.0 Attachments: HDFS-7946.00.patch TestDataNodeVolumeFailureReporting has a pre-test setUp() that calls assumeTrue(!Path.WINDOWS), but the post-test tearDown() does not. This triggers an NPE when closing the cluster. testSuccessiveVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting) Time elapsed: 0.267 sec ERROR! org.junit.internal.AssumptionViolatedException: got: false, expected: is true at org.junit.Assume.assumeThat(Assume.java:95) at org.junit.Assume.assumeTrue(Assume.java:41) at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting.setUp(TestDataNodeVolumeFailureReporting.java:83) testSuccessiveVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting) Time elapsed: 0.267 sec ERROR! java.lang.NullPointerException: null at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting.tearDown(TestDataNodeVolumeFailureReporting.java:103) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7928) Scanning blocks from disk during rolling upgrade startup takes a lot of time if disks are busy
[ https://issues.apache.org/jira/browse/HDFS-7928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-7928: - Attachment: HDFS-7928-v1.patch Addressed all of Daryn's comments except for the 8th one. I assume he made that comment with a future layout version change in mind; such a change will make the readReplicasFromFile function blow up and throw an exception, and in case of any exception while reading the cache file, this code falls back to reading from the disk. Please review and comment. Scanning blocks from disk during rolling upgrade startup takes a lot of time if disks are busy -- Key: HDFS-7928 URL: https://issues.apache.org/jira/browse/HDFS-7928 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Rushabh S Shah Assignee: Rushabh S Shah Attachments: HDFS-7928-v1.patch, HDFS-7928.patch We observed this issue during a rolling upgrade to 2.6.x on one of our clusters. One of the disks was very busy, and it took a long time to scan that disk compared to the other disks. Looking at the sar (System Activity Reporter) data, we saw that this particular disk was very busy performing IO operations. Requesting an improvement during datanode rolling upgrade: during shutdown, we can persist the whole volume map on the disk and let the datanode read that file and create the volume map during startup after the rolling upgrade. This will not require the datanode process to scan all the disks and read the blocks, and it will significantly improve datanode startup time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
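To make the fallback behavior concrete, here is a hedged sketch of the approach described in the comment above. The file name, format, and all identifiers are assumptions for illustration, not the actual HDFS-7928 patch: load a persisted replica list if one exists, and on any read or parse failure silently fall back to the full disk scan.
{code}
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: "replicas.cache" and the one-replica-per-line format
// are made up. A real format would carry a version header so that a future
// layout change makes parsing fail and triggers the fallback path.
public class ReplicaCacheSketch {
  static List<String> loadReplicas(File volumeDir) {
    File cache = new File(volumeDir, "replicas.cache");
    if (cache.exists()) {
      try {
        return Files.readAllLines(cache.toPath());
      } catch (IOException | RuntimeException e) {
        // Corrupt or incompatible cache file: ignore it and rescan the disk.
      }
    }
    return scanDisk(volumeDir);
  }

  static List<String> scanDisk(File volumeDir) {
    List<String> replicas = new ArrayList<>();
    File[] files = volumeDir.listFiles();
    if (files != null) {
      for (File f : files) {
        if (f.getName().startsWith("blk_")) { // HDFS block file prefix
          replicas.add(f.getName());
        }
      }
    }
    return replicas;
  }
}
{code}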
[jira] [Commented] (HDFS-7929) inotify unable to fetch pre-upgrade edit log segments once upgrade starts
[ https://issues.apache.org/jira/browse/HDFS-7929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367453#comment-14367453 ] Zhe Zhang commented on HDFS-7929: - {{TestLazyPersistFiles}} is unrelated and passes locally (in the Jenkins job it was killed rather than failed) inotify unable to fetch pre-upgrade edit log segments once upgrade starts -- Key: HDFS-7929 URL: https://issues.apache.org/jira/browse/HDFS-7929 Project: Hadoop HDFS Issue Type: Bug Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-7929-000.patch, HDFS-7929-001.patch, HDFS-7929-002.patch, HDFS-7929-003.patch inotify is often used to periodically poll HDFS events. However, once an HDFS upgrade has started, edit logs are moved to /previous on the NN, which is not accessible. Moreover, once the upgrade is finalized, /previous is currently lost forever. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7940) Add tracing to DFSClient#setQuotaByStorageType
[ https://issues.apache.org/jira/browse/HDFS-7940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367307#comment-14367307 ] Hudson commented on HDFS-7940: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2086 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2086/]) HDFS-7940. Add tracing to DFSClient#setQuotaByStorageType (Rakesh R via Colin P. McCabe) (cmccabe: rev d8846707c58c5c3ec542128df13a82ddc05fb347) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Add tracing to DFSClient#setQuotaByStorageType -- Key: HDFS-7940 URL: https://issues.apache.org/jira/browse/HDFS-7940 Project: Hadoop HDFS Issue Type: Sub-task Components: dfsclient Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.7.0 Attachments: HDFS-7940-01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7954) TestBalancer#testBalancerWithPinnedBlocks failed on both Linux and Windows
[ https://issues.apache.org/jira/browse/HDFS-7954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367382#comment-14367382 ] Xiaoyu Yao commented on HDFS-7954: -- The iteration output from the balancer does not seem to be right. Also, the test should not run on Windows before HDFS-7759 is ready. {code}
Time Stamp              Iteration  Moved    LeftToMove  BeingMoved
Mar 18, 2015 9:02:31 AM 0          500 B    2.12 KB     500 B
Mar 18, 2015 9:02:35 AM 1          1000 B   1.63 KB     500 B
Mar 18, 2015 9:02:39 AM 2          1.46 KB  1.14 KB     500 B
Mar 18, 2015 9:02:43 AM 3          1.95 KB  666 B       500 B
Mar 18, 2015 9:02:47 AM 4          2.44 KB  166 B       500 B
The cluster is balanced. Exiting...
Mar 18, 2015 9:02:50 AM 5          2.44 KB  0 B         -1 B
{code} TestBalancer#testBalancerWithPinnedBlocks failed on both Linux and Windows -- Key: HDFS-7954 URL: https://issues.apache.org/jira/browse/HDFS-7954 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Xiaoyu Yao {code} testBalancerWithPinnedBlocks(org.apache.hadoop.hdfs.server.balancer.TestBalancer) Time elapsed: 22.624 sec FAILURE! java.lang.AssertionError: expected:<-3> but was:<0> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancerWithPinnedBlocks(TestBalancer.java:353) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7928) Scanning blocks from disk during rolling upgrade startup takes a lot of time if disks are busy
[ https://issues.apache.org/jira/browse/HDFS-7928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-7928: - Target Version/s: 3.0.0, 2.8.0 (was: 2.8.0) Scanning blocks from disk during rolling upgrade startup takes a lot of time if disks are busy -- Key: HDFS-7928 URL: https://issues.apache.org/jira/browse/HDFS-7928 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Rushabh S Shah Assignee: Rushabh S Shah Attachments: HDFS-7928.patch We observed this issue during a rolling upgrade to 2.6.x on one of our clusters. One of the disks was very busy, and it took a long time to scan that disk compared to the other disks. Looking at the sar (System Activity Reporter) data, we saw that this particular disk was very busy performing IO operations. Requesting an improvement during datanode rolling upgrade: during shutdown, we can persist the whole volume map on the disk and let the datanode read that file and create the volume map during startup after the rolling upgrade. This will not require the datanode process to scan all the disks and read the blocks, and it will significantly improve datanode startup time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7934) During Rolling upgrade rollback, standby namenode startup fails.
[ https://issues.apache.org/jira/browse/HDFS-7934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367391#comment-14367391 ] Hadoop QA commented on HDFS-7934: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705338/HDFS-7934.1.patch against trunk revision 9d72f93. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9958//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9958//console This message is automatically generated. During Rolling upgrade rollback, standby namenode startup fails. Key: HDFS-7934 URL: https://issues.apache.org/jira/browse/HDFS-7934 Project: Hadoop HDFS Issue Type: Bug Reporter: J.Andreina Assignee: J.Andreina Priority: Critical Attachments: HDFS-7934.1.patch During rolling upgrade rollback, standby namenode startup fails while loading edits when there is no local copy of the edits created after the upgrade (which have already been removed by the Active Namenode from the journal manager and from the Active's local storage). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7945) The WebHdfs system on DN does not honor the length parameter
[ https://issues.apache.org/jira/browse/HDFS-7945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367415#comment-14367415 ] Eric Payne commented on HDFS-7945: -- Thanks for implementing the fix for this issue, [~wheat9]. +1, looks good. The WebHdfs system on DN does not honor the length parameter Key: HDFS-7945 URL: https://issues.apache.org/jira/browse/HDFS-7945 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Haohui Mai Priority: Blocker Attachments: HDFS-7945.000.patch HDFS-7279 introduces a new WebHdfs server on the DN. The new server does not honor the length parameter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7954) TestBalancer#testBalancerWithPinnedBlocks failed on both Linux and Windows
Xiaoyu Yao created HDFS-7954: Summary: TestBalancer#testBalancerWithPinnedBlocks failed on both Linux and Windows Key: HDFS-7954 URL: https://issues.apache.org/jira/browse/HDFS-7954 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Xiaoyu Yao {code} testBalancerWithPinnedBlocks(org.apache.hadoop.hdfs.server.balancer.TestBalancer) Time elapsed: 22.624 sec FAILURE! java.lang.AssertionError: expected:<-3> but was:<0> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancerWithPinnedBlocks(TestBalancer.java:353) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7953) NN Web UI fails to navigate to paths that contain #
[ https://issues.apache.org/jira/browse/HDFS-7953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-7953: - Summary: NN Web UI fails to navigate to paths that contain # (was: Browse Directory can't work if path contains # in Namenode web UI) NN Web UI fails to navigate to paths that contain # --- Key: HDFS-7953 URL: https://issues.apache.org/jira/browse/HDFS-7953 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: kanaka kumar avvaru Assignee: kanaka kumar avvaru Priority: Minor Attachments: hdfs-7953.01.patch 1. Create a directory with a name containing '#', like test#folder 2. Add a file under this folder 3. Use the NameNode UI to browse the folder It gives the following error in the web UI: Failed to retrieve data from /webhdfs/v1/test#folder?op=LISTSTATUS: Bad Request -- This message was sent by Atlassian JIRA (v6.3.4#6332)
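The underlying issue is that '#' begins the fragment part of a URL, so a request for /webhdfs/v1/test#folder?op=LISTSTATUS sends only /webhdfs/v1/test to the server. The committed fix lives in explorer.js (see the commit notification below); as a language-neutral illustration of why percent-encoding the path segment helps, here is a small hypothetical Java sketch. URLEncoder applies form encoding, which is enough to show '#' becoming %23.
{code}
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class EncodePathSketch {
  public static void main(String[] args) throws UnsupportedEncodingException {
    String dir = "test#folder";
    String encoded = URLEncoder.encode(dir, "UTF-8"); // '#' -> %23
    System.out.println("/webhdfs/v1/" + encoded + "?op=LISTSTATUS");
    // prints /webhdfs/v1/test%23folder?op=LISTSTATUS, which the server now
    // sees in full instead of stopping at the fragment marker.
  }
}
{code}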
[jira] [Commented] (HDFS-7953) NN Web UI fails to navigate to paths that contain #
[ https://issues.apache.org/jira/browse/HDFS-7953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367610#comment-14367610 ] Hadoop QA commented on HDFS-7953: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705354/hdfs-7953.01.patch against trunk revision 9d72f93. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9960//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9960//console This message is automatically generated. NN Web UI fails to navigate to paths that contain # --- Key: HDFS-7953 URL: https://issues.apache.org/jira/browse/HDFS-7953 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: kanaka kumar avvaru Assignee: kanaka kumar avvaru Priority: Minor Fix For: 2.7.0 Attachments: hdfs-7953.01.patch 1. Create a directory with a name containing '#', like test#folder 2. Add a file under this folder 3. Use the NameNode UI to browse the folder It gives the following error in the web UI: Failed to retrieve data from /webhdfs/v1/test#folder?op=LISTSTATUS: Bad Request -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7953) NN Web UI fails to navigate to paths that contain #
[ https://issues.apache.org/jira/browse/HDFS-7953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-7953: - Resolution: Fixed Fix Version/s: 2.7.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed the patch to trunk, branch-2 and branch-2.7. Thanks [~kanaka] for the contribution. NN Web UI fails to navigate to paths that contain # --- Key: HDFS-7953 URL: https://issues.apache.org/jira/browse/HDFS-7953 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: kanaka kumar avvaru Assignee: kanaka kumar avvaru Priority: Minor Fix For: 2.7.0 Attachments: hdfs-7953.01.patch 1. Create a directory with a name containing '#', like test#folder 2. Add a file under this folder 3. Use the NameNode UI to browse the folder It gives the following error in the web UI: Failed to retrieve data from /webhdfs/v1/test#folder?op=LISTSTATUS: Bad Request -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7054) Make DFSOutputStream tracing more fine-grained
[ https://issues.apache.org/jira/browse/HDFS-7054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367512#comment-14367512 ] stack commented on HDFS-7054: - Patch LGTM. A few questions: When would parents come back empty? 435 long parents[] = one.getTraceParents(); 436 if (parents.length > 0) { In hsync, the no-arg method calls the method that takes an Enum, and both get a scope and close it when done. Could be a double-close. Is that ok? Do we do it in both methods because either could be called? Could we have a test for the manipulation in getTraceParents? Make DFSOutputStream tracing more fine-grained -- Key: HDFS-7054 URL: https://issues.apache.org/jira/browse/HDFS-7054 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: 2.6.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-7054.001.patch, HDFS-7054.002.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7953) Browse Directory can't work if path contains # in Namenode web UI
[ https://issues.apache.org/jira/browse/HDFS-7953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367566#comment-14367566 ] Haohui Mai commented on HDFS-7953: -- LGTM. +1. I'll commit it shortly. Browse Directory can't work if path contains # in Namenode web UI - Key: HDFS-7953 URL: https://issues.apache.org/jira/browse/HDFS-7953 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: kanaka kumar avvaru Assignee: kanaka kumar avvaru Priority: Minor Attachments: hdfs-7953.01.patch 1. Create a directory with a name containing '#', like test#folder 2. Add a file under this folder 3. Use the NameNode UI to browse the folder It gives the following error in the web UI: Failed to retrieve data from /webhdfs/v1/test#folder?op=LISTSTATUS: Bad Request -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7953) NN Web UI fails to navigate to paths that contain #
[ https://issues.apache.org/jira/browse/HDFS-7953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367600#comment-14367600 ] Hudson commented on HDFS-7953: -- FAILURE: Integrated in Hadoop-trunk-Commit #7358 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7358/]) HDFS-7953. NN Web UI fails to navigate to paths that contain #. Contributed by kanaka kumar avvaru. (wheat9: rev 402817cd9b786455c9b885345c5fbb178acd244b) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/explorer.js NN Web UI fails to navigate to paths that contain # --- Key: HDFS-7953 URL: https://issues.apache.org/jira/browse/HDFS-7953 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: kanaka kumar avvaru Assignee: kanaka kumar avvaru Priority: Minor Fix For: 2.7.0 Attachments: hdfs-7953.01.patch 1. Create a directory with a name containing '#', like test#folder 2. Add a file under this folder 3. Use the NameNode UI to browse the folder It gives the following error in the web UI: Failed to retrieve data from /webhdfs/v1/test#folder?op=LISTSTATUS: Bad Request -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7955) Improve naming of classes, methods, and variables related to block replication and recovery
Zhe Zhang created HDFS-7955: --- Summary: Improve naming of classes, methods, and variables related to block replication and recovery Key: HDFS-7955 URL: https://issues.apache.org/jira/browse/HDFS-7955 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Many existing names should be revised to avoid confusion when blocks can be both replicated and erasure coded. This JIRA aims to solicit opinions on making those names more consistent and intuitive. # In current HDFS _block recovery_ refers to the process of finalizing the last block of a file, triggered by _lease recovery_. It is different from the intuitive meaning of _recovering a lost block_. To avoid confusion, I can think of 2 options: #* Rename this process as _block finalization_ or _block completion_. I prefer this option because this is literally not a recovery. #* If we want to keep existing terms unchanged we can name all EC recovery and re-replication logics as _reconstruction_. # As Kai [suggested | https://issues.apache.org/jira/browse/HDFS-7369?focusedCommentId=14361131page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14361131] under HDFS-7369, several replication-based names should be made more generic: #* {{UnderReplicatedBlocks}} and {{neededReplications}}. E.g. we can use {{LowRedundancyBlocks}}/{{AtRiskBlocks}}, and {{neededRecovery}}/{{neededReconstruction}}. # {{PendingReplicationBlocks}} # {{ReplicationMonitor}} I'm sure the above list is incomplete; discussions and comments are very welcome. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7863) Missing description of parameter fsd in javadoc
[ https://issues.apache.org/jira/browse/HDFS-7863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367625#comment-14367625 ] Brahma Reddy Battula commented on HDFS-7863: [~yzhangal] kindly review!! Missing description of parameter fsd in javadoc Key: HDFS-7863 URL: https://issues.apache.org/jira/browse/HDFS-7863 Project: Hadoop HDFS Issue Type: Improvement Reporter: Yongjun Zhang Assignee: Brahma Reddy Battula Priority: Minor Attachments: HDFS-7863.patch HDFS-7573 did some refactoring of the delete() code. The new parameter {{FSDirectory fsd}} was added to the resulting methods, but the javadoc is not updated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7433) Optimize performance of DatanodeManager's node map
[ https://issues.apache.org/jira/browse/HDFS-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367579#comment-14367579 ] Hadoop QA commented on HDFS-7433: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12703937/HDFS-7433.patch against trunk revision 9d72f93. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9959//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9959//console This message is automatically generated. Optimize performance of DatanodeManager's node map -- Key: HDFS-7433 URL: https://issues.apache.org/jira/browse/HDFS-7433 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical Attachments: HDFS-7433.patch, HDFS-7433.patch, HDFS-7433.patch, HDFS-7433.patch The datanode map is currently a {{TreeMap}}. For many thousands of datanodes, tree lookups are ~10X more expensive than a {{HashMap}}. Insertions and removals are up to 100X more expensive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
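To get a rough feel for the TreeMap-vs-HashMap gap described in the issue above, here is a crude timing sketch. It is not a rigorous benchmark (use JMH for real numbers) and says nothing about NameNode locking; it only illustrates that HashMap lookups are O(1) while TreeMap lookups are O(log n) with extra pointer chasing. All names here are made up for the illustration.
{code}
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class MapLookupSketch {
  public static void main(String[] args) {
    int n = 100_000;
    String[] keys = new String[n];
    Map<String, Integer> tree = new TreeMap<>();
    Map<String, Integer> hash = new HashMap<>();
    for (int i = 0; i < n; i++) {
      keys[i] = "datanode-uuid-" + i; // stand-in for a datanode key
      tree.put(keys[i], i);
      hash.put(keys[i], i);
    }
    time("TreeMap", tree, keys);
    time("HashMap", hash, keys);
  }

  static void time(String label, Map<String, Integer> map, String[] keys) {
    long start = System.nanoTime();
    long sum = 0;
    for (int round = 0; round < 100; round++) {
      for (String k : keys) {
        sum += map.get(k); // lookup cost is what we are comparing
      }
    }
    System.out.println(label + ": "
        + (System.nanoTime() - start) / 1_000_000 + " ms (checksum " + sum + ")");
  }
}
{code}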
[jira] [Updated] (HDFS-7955) Improve naming of classes, methods, and variables related to block replication and recovery
[ https://issues.apache.org/jira/browse/HDFS-7955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-7955: Description: Many existing names should be revised to avoid confusion when blocks can be both replicated and erasure coded. This JIRA aims to solicit opinions on making those names more consistent and intuitive. # In current HDFS _block recovery_ refers to the process of finalizing the last block of a file, triggered by _lease recovery_. It is different from the intuitive meaning of _recovering a lost block_. To avoid confusion, I can think of 2 options: #* Rename this process as _block finalization_ or _block completion_. I prefer this option because this is literally not a recovery. #* If we want to keep existing terms unchanged we can name all EC recovery and re-replication logics as _reconstruction_. # As Kai [suggested | https://issues.apache.org/jira/browse/HDFS-7369?focusedCommentId=14361131page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14361131] under HDFS-7369, several replication-based names should be made more generic: #* {{UnderReplicatedBlocks}} and {{neededReplications}}. E.g. we can use {{LowRedundancyBlocks}}/{{AtRiskBlocks}}, and {{neededRecovery}}/{{neededReconstruction}}. #* {{PendingReplicationBlocks}} #* {{ReplicationMonitor}} I'm sure the above list is incomplete; discussions and comments are very welcome. was: Many existing names should be revised to avoid confusion when blocks can be both replicated and erasure coded. This JIRA aims to solicit opinions on making those names more consistent and intuitive. # In current HDFS _block recovery_ refers to the process of finalizing the last block of a file, triggered by _lease recovery_. It is different from the intuitive meaning of _recovering a lost block_. To avoid confusion, I can think of 2 options: #* Rename this process as _block finalization_ or _block completion_. I prefer this option because this is literally not a recovery. #* If we want to keep existing terms unchanged we can name all EC recovery and re-replication logics as _reconstruction_. # As Kai [suggested | https://issues.apache.org/jira/browse/HDFS-7369?focusedCommentId=14361131page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14361131] under HDFS-7369, several replication-based names should be made more generic: #* {{UnderReplicatedBlocks}} and {{neededReplications}}. E.g. we can use {{LowRedundancyBlocks}}/{{AtRiskBlocks}}, and {{neededRecovery}}/{{neededReconstruction}}. # {{PendingReplicationBlocks}} # {{ReplicationMonitor}} I'm sure the above list is incomplete; discussions and comments are very welcome. Improve naming of classes, methods, and variables related to block replication and recovery --- Key: HDFS-7955 URL: https://issues.apache.org/jira/browse/HDFS-7955 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Many existing names should be revised to avoid confusion when blocks can be both replicated and erasure coded. This JIRA aims to solicit opinions on making those names more consistent and intuitive. # In current HDFS _block recovery_ refers to the process of finalizing the last block of a file, triggered by _lease recovery_. It is different from the intuitive meaning of _recovering a lost block_. To avoid confusion, I can think of 2 options: #* Rename this process as _block finalization_ or _block completion_. I prefer this option because this is literally not a recovery. 
#* If we want to keep existing terms unchanged we can name all EC recovery and re-replication logics as _reconstruction_. # As Kai [suggested | https://issues.apache.org/jira/browse/HDFS-7369?focusedCommentId=14361131page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14361131] under HDFS-7369, several replication-based names should be made more generic: #* {{UnderReplicatedBlocks}} and {{neededReplications}}. E.g. we can use {{LowRedundancyBlocks}}/{{AtRiskBlocks}}, and {{neededRecovery}}/{{neededReconstruction}}. #* {{PendingReplicationBlocks}} #* {{ReplicationMonitor}} I'm sure the above list is incomplete; discussions and comments are very welcome. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7037) Using distcp to copy data from insecure to secure cluster via hftp doesn't work (branch-2 only)
[ https://issues.apache.org/jira/browse/HDFS-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367627#comment-14367627 ] Haohui Mai commented on HDFS-7037: -- Thanks for the ping. My position is still the same -- falling back to insecure mode at the filesystem layer invariably opens up subtle security vulnerabilities. Unfortunately I have hit both the issue and the misconfiguration in practice. I have a strong preference not to do so; my reasoning can be found in the relevant jiras. As I pointed out in HDFS-6776, you'll need to fix this issue for every single filesystem. I would appreciate it if you could continue to investigate doing it in distcp. I'll comment on HDFS-7036 later today. Using distcp to copy data from insecure to secure cluster via hftp doesn't work (branch-2 only) Key: HDFS-7037 URL: https://issues.apache.org/jira/browse/HDFS-7037 Project: Hadoop HDFS Issue Type: Bug Components: security, tools Affects Versions: 2.6.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Attachments: HDFS-7037.001.patch This is a branch-2 only issue since hftp is only supported there. Issuing distcp hftp://insecureCluster hdfs://secureCluster gave the following failure exception: {code} 14/09/13 22:07:40 INFO tools.DelegationTokenFetcher: Error when dealing remote token: java.io.IOException: Error when dealing remote token: Internal Server Error at org.apache.hadoop.hdfs.tools.DelegationTokenFetcher.run(DelegationTokenFetcher.java:375) at org.apache.hadoop.hdfs.tools.DelegationTokenFetcher.getDTfromRemote(DelegationTokenFetcher.java:238) at org.apache.hadoop.hdfs.web.HftpFileSystem$2.run(HftpFileSystem.java:252) at org.apache.hadoop.hdfs.web.HftpFileSystem$2.run(HftpFileSystem.java:247) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.HftpFileSystem.getDelegationToken(HftpFileSystem.java:247) at org.apache.hadoop.hdfs.web.TokenAspect.ensureTokenInitialized(TokenAspect.java:140) at org.apache.hadoop.hdfs.web.HftpFileSystem.addDelegationTokenParam(HftpFileSystem.java:337) at org.apache.hadoop.hdfs.web.HftpFileSystem.openConnection(HftpFileSystem.java:324) at org.apache.hadoop.hdfs.web.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:457) at org.apache.hadoop.hdfs.web.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:472) at org.apache.hadoop.hdfs.web.HftpFileSystem.getFileStatus(HftpFileSystem.java:501) at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:57) at org.apache.hadoop.fs.Globber.glob(Globber.java:248) at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1623) at org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:77) at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:81) at org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:342) at org.apache.hadoop.tools.DistCp.execute(DistCp.java:154) at org.apache.hadoop.tools.DistCp.run(DistCp.java:121) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.tools.DistCp.main(DistCp.java:390) 14/09/13 22:07:40 WARN security.UserGroupInformation: PriviledgedActionException as:hadoopu...@xyz.com (auth:KERBEROS) cause:java.io.IOException: Unable to obtain remote token 14/09/13 22:07:40 ERROR tools.DistCp: Exception encountered java.io.IOException: Unable to obtain remote token at
org.apache.hadoop.hdfs.tools.DelegationTokenFetcher.getDTfromRemote(DelegationTokenFetcher.java:249) at org.apache.hadoop.hdfs.web.HftpFileSystem$2.run(HftpFileSystem.java:252) at org.apache.hadoop.hdfs.web.HftpFileSystem$2.run(HftpFileSystem.java:247) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554) at org.apache.hadoop.hdfs.web.HftpFileSystem.getDelegationToken(HftpFileSystem.java:247) at org.apache.hadoop.hdfs.web.TokenAspect.ensureTokenInitialized(TokenAspect.java:140) at org.apache.hadoop.hdfs.web.HftpFileSystem.addDelegationTokenParam(HftpFileSystem.java:337) at org.apache.hadoop.hdfs.web.HftpFileSystem.openConnection(HftpFileSystem.java:324) at org.apache.hadoop.hdfs.web.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:457) at
[jira] [Commented] (HDFS-7697) Document the scope of the PB OIV tool
[ https://issues.apache.org/jira/browse/HDFS-7697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367825#comment-14367825 ] Hadoop QA commented on HDFS-7697: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12704703/HDFS-7697.000.patch against trunk revision 9d72f93. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestFileLengthOnClusterRestart Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9962//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9962//console This message is automatically generated. Document the scope of the PB OIV tool - Key: HDFS-7697 URL: https://issues.apache.org/jira/browse/HDFS-7697 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Lei (Eddy) Xu Attachments: HDFS-7697.000.patch As per HDFS-6673, we need to document the applicable scope of the new PB OIV tool so that it won't catch users by surprise. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7697) Document the scope of the PB OIV tool
[ https://issues.apache.org/jira/browse/HDFS-7697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367830#comment-14367830 ] Haohui Mai commented on HDFS-7697: -- +1. I'll commit shortly. Document the scope of the PB OIV tool - Key: HDFS-7697 URL: https://issues.apache.org/jira/browse/HDFS-7697 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Lei (Eddy) Xu Attachments: HDFS-7697.000.patch As per HDFS-6673, we need to document the applicable scope of the new PB OIV tool so that it won't catch users by surprise. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7853) Erasure coding: extend LocatedBlocks to support reading from striped files
[ https://issues.apache.org/jira/browse/HDFS-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367831#comment-14367831 ] Zhe Zhang commented on HDFS-7853: - In HDFS-7369 we are trying to find the indices of missing blocks in a group (in {{BlockManager#chooseSourceDatanodes}}). The current logic is to iterate through all the block's storages, find each storage's index through {{BlockInfoStriped#getStorageBlockIndex}}, deduct those, and find out which ones are missing. I think this operation would be more efficient if we used sentinels. [~jingzhao] What's your thought on this? Erasure coding: extend LocatedBlocks to support reading from striped files -- Key: HDFS-7853 URL: https://issues.apache.org/jira/browse/HDFS-7853 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Jing Zhao Fix For: HDFS-7285 Attachments: HDFS-7853.000.patch, HDFS-7853.001.patch, HDFS-7853.002.patch We should extend {{LocatedBlocks}} class so {{getBlockLocations}} can work with striping layout (possibly an extra list specifying the index of each location in the group) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7697) Mark the PB OIV tool as experimental
[ https://issues.apache.org/jira/browse/HDFS-7697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-7697: - Summary: Mark the PB OIV tool as experimental (was: Document the scope of the PB OIV tool) Mark the PB OIV tool as experimental Key: HDFS-7697 URL: https://issues.apache.org/jira/browse/HDFS-7697 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Lei (Eddy) Xu Attachments: HDFS-7697.000.patch As per HDFS-6673, we need to document the applicable scope of the new PB OIV tool so that it won't catch users by surprise. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7697) Mark the PB OIV tool as experimental
[ https://issues.apache.org/jira/browse/HDFS-7697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-7697: - Resolution: Fixed Fix Version/s: 2.7.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed the patch to trunk and branch-2. Thanks [~eddyxu] for taking care of this. Mark the PB OIV tool as experimental Key: HDFS-7697 URL: https://issues.apache.org/jira/browse/HDFS-7697 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Lei (Eddy) Xu Fix For: 2.7.0 Attachments: HDFS-7697.000.patch As per HDFS-6673, we need to document the applicable scope of the new PB OIV tool so that it won't catch users by surprise. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7697) Mark the PB OIV tool as experimental
[ https://issues.apache.org/jira/browse/HDFS-7697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367838#comment-14367838 ] Lei (Eddy) Xu commented on HDFS-7697: - Thanks a lot for reviewing and committing this, [~wheat9]. Mark the PB OIV tool as experimental Key: HDFS-7697 URL: https://issues.apache.org/jira/browse/HDFS-7697 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Lei (Eddy) Xu Fix For: 2.7.0 Attachments: HDFS-7697.000.patch As per HDFS-6673, we need to document the applicable scope of the new PB OIV tool so that it won't catch users by surprise. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7853) Erasure coding: extend LocatedBlocks to support reading from striped files
[ https://issues.apache.org/jira/browse/HDFS-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367848#comment-14367848 ] Jing Zhao commented on HDFS-7853: - Can we record the index of healthy blocks for the recovery command in HDFS-7369, instead of recording the missing blocks? Then we only need to iterate through all the storages. Erasure coding: extend LocatedBlocks to support reading from striped files -- Key: HDFS-7853 URL: https://issues.apache.org/jira/browse/HDFS-7853 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Jing Zhao Fix For: HDFS-7285 Attachments: HDFS-7853.000.patch, HDFS-7853.001.patch, HDFS-7853.002.patch We should extend {{LocatedBlocks}} class so {{getBlockLocations}} can work with striping layout (possibly an extra list specifying the index of each location in the group) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
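Here is a sketch of the index bookkeeping being discussed, with made-up inputs: one pass over the live storages marks the block indices that are present (what {{BlockInfoStriped#getStorageBlockIndex}} would report per storage), after which either the healthy set or the missing set falls out directly. This is an illustration of the idea, not code from the patches.
{code}
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class StripedIndicesSketch {
  public static void main(String[] args) {
    int groupSize = 9; // hypothetical 6 data + 3 parity group
    int[] liveStorageIndices = {0, 1, 3, 4, 6, 7, 8}; // index per live storage
    BitSet healthy = new BitSet(groupSize);
    for (int idx : liveStorageIndices) {
      healthy.set(idx); // single pass over the storages
    }
    List<Integer> missing = new ArrayList<>();
    for (int i = 0; i < groupSize; i++) {
      if (!healthy.get(i)) {
        missing.add(i); // {2, 5} for the inputs above
      }
    }
    System.out.println("healthy=" + healthy + " missing=" + missing);
  }
}
{code}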
[jira] [Commented] (HDFS-7697) Mark the PB OIV tool as experimental
[ https://issues.apache.org/jira/browse/HDFS-7697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367853#comment-14367853 ] Hudson commented on HDFS-7697: -- FAILURE: Integrated in Hadoop-trunk-Commit #7363 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7363/]) HDFS-7697. Mark the PB OIV tool as experimental. Contributed by Lei (Eddy) Xu. (wheat9: rev f0dea037ffa5ffdb0a3d58f806ee313f2998f781) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/OfflineImageViewerPB.java * hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsImageViewer.md * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Mark the PB OIV tool as experimental Key: HDFS-7697 URL: https://issues.apache.org/jira/browse/HDFS-7697 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Lei (Eddy) Xu Fix For: 2.7.0 Attachments: HDFS-7697.000.patch As per HDFS-6673, we need to document the applicable scope of the new PB OIV tool so that it won't catch users by surprise. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7948) TestDataNodeHotSwapVolumes#testAddVolumeFailures failed on Windows
[ https://issues.apache.org/jira/browse/HDFS-7948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367723#comment-14367723 ] Arpit Agarwal commented on HDFS-7948: - +1 for the patch. I verified it fixes the test failure on Windows. I will commit it shortly. TestDataNodeHotSwapVolumes#testAddVolumeFailures failed on Windows -- Key: HDFS-7948 URL: https://issues.apache.org/jira/browse/HDFS-7948 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7948.00.patch The failure below occurs because File#getCanonicalPath() does not work on Windows with a File object created from a URI-format path (file:/c:/users/xyz/test/data/dfs/data/new_vol1). I will post a fix shortly. {code} testAddVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes) Time elapsed: 5.746 sec ERROR! java.io.IOException: The filename, directory name, or volume label syntax is incorrect at java.io.WinNTFileSystem.canonicalize0(Native Method) at java.io.Win32FileSystem.canonicalize(Win32FileSystem.java:414) at java.io.File.getCanonicalPath(File.java:589) at org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.testAddVolumeFailures(TestDataNodeHotSwapVolumes.java:525) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
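For readers hitting the same pitfall, a small sketch of the general java.io behavior involved (not the exact test change): a File constructed from the raw URI string treats the whole string as a file name, which Windows cannot canonicalize, while a File constructed from a parsed java.net.URI is translated into a native path first.
{code}
import java.io.File;
import java.io.IOException;
import java.net.URI;

public class CanonicalPathSketch {
  public static void main(String[] args) throws IOException {
    String s = "file:/c:/users/xyz/test/data/dfs/data/new_vol1";
    File fromString = new File(s);          // whole URI string taken as a file name
    File fromUri = new File(URI.create(s)); // translated to a native path first
    System.out.println(fromUri.getCanonicalPath());
    // fromString.getCanonicalPath() is the call that throws on Windows with
    // "The filename, directory name, or volume label syntax is incorrect".
  }
}
{code}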
[jira] [Updated] (HDFS-7948) TestDataNodeHotSwapVolumes#testAddVolumeFailures failed on Windows
[ https://issues.apache.org/jira/browse/HDFS-7948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-7948: Resolution: Fixed Fix Version/s: 2.7.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk through branch-2.7. Thanks for the contribution [~xyao]. TestDataNodeHotSwapVolumes#testAddVolumeFailures failed on Windows -- Key: HDFS-7948 URL: https://issues.apache.org/jira/browse/HDFS-7948 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.7.0 Attachments: HDFS-7948.00.patch The failure below occurs because File#getCanonicalPath() does not work on Windows with a File object created from a URI-format path (file:/c:/users/xyz/test/data/dfs/data/new_vol1). I will post a fix shortly. {code} testAddVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes) Time elapsed: 5.746 sec ERROR! java.io.IOException: The filename, directory name, or volume label syntax is incorrect at java.io.WinNTFileSystem.canonicalize0(Native Method) at java.io.Win32FileSystem.canonicalize(Win32FileSystem.java:414) at java.io.File.getCanonicalPath(File.java:589) at org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.testAddVolumeFailures(TestDataNodeHotSwapVolumes.java:525) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7948) TestDataNodeHotSwapVolumes#testAddVolumeFailures failed on Windows
[ https://issues.apache.org/jira/browse/HDFS-7948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14367737#comment-14367737 ] Hudson commented on HDFS-7948: -- FAILURE: Integrated in Hadoop-trunk-Commit #7359 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7359/]) HDFS-7948. TestDataNodeHotSwapVolumes#testAddVolumeFailures failed on Windows. (Contributed by Xiaoyu Yao) (arp: rev fdd58aa32f3e007b9e8bebe655b4c05e39b63110) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeHotSwapVolumes.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt TestDataNodeHotSwapVolumes#testAddVolumeFailures failed on Windows -- Key: HDFS-7948 URL: https://issues.apache.org/jira/browse/HDFS-7948 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.7.0 Attachments: HDFS-7948.00.patch The failure below occurs because File#getCanonicalPath() does not work on Windows with a File object created from a URI-format path (file:/c:/users/xyz/test/data/dfs/data/new_vol1). I will post a fix shortly. {code} testAddVolumeFailures(org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes) Time elapsed: 5.746 sec ERROR! java.io.IOException: The filename, directory name, or volume label syntax is incorrect at java.io.WinNTFileSystem.canonicalize0(Native Method) at java.io.Win32FileSystem.canonicalize(Win32FileSystem.java:414) at java.io.File.getCanonicalPath(File.java:589) at org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.testAddVolumeFailures(TestDataNodeHotSwapVolumes.java:525) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7950) Fix TestFsDatasetImpl#testAddVolumes failure on Windows
[ https://issues.apache.org/jira/browse/HDFS-7950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-7950: Resolution: Fixed Fix Version/s: 2.7.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk through branch-2.7. Thanks for the contribution [~xyao]. Fix TestFsDatasetImpl#testAddVolumes failure on Windows --- Key: HDFS-7950 URL: https://issues.apache.org/jira/browse/HDFS-7950 Project: Hadoop HDFS Issue Type: Sub-task Components: test Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Fix For: 2.7.0 Attachments: HDFS-7950.00.patch, HDFS-7950.01.patch, HDFS-7950.02.patch The test should use Iterables.elementsEqual() instead of JUnit assertEquals to compare the two object lists. I will post a patch shortly. {code} testAddVolumes(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl) Time elapsed: 0.116 sec FAILURE! java.lang.AssertionError: expected:<[D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target\test\data\Y8839Td1WC\newData2, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target\test\data\Y8839Td1WC\newData1, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target\test\data\Y8839Td1WC\newData0]> but was:<[D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target/test/data/Y8839Td1WC/newData0, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target/test/data/Y8839Td1WC/newData2, D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target/test/data/Y8839Td1WC/newData1]> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testAddVolumes(TestFsDatasetImpl.java:165) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
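A usage sketch of the comparison the description suggests, assuming Guava is on the classpath (Hadoop already depends on it). Note that Iterables.elementsEqual is still order-sensitive; it walks both iterables in lockstep, which is useful when one side is a general Iterable and List.equals cannot apply.
{code}
import com.google.common.collect.Iterables;

import java.util.Arrays;
import java.util.List;

public class ElementsEqualSketch {
  public static void main(String[] args) {
    List<String> expected = Arrays.asList("newData0", "newData1", "newData2");
    Iterable<String> actual = Arrays.asList("newData0", "newData1", "newData2");
    // true only if both sequences contain equal elements in the same order
    System.out.println(Iterables.elementsEqual(expected, actual));
  }
}
{code}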
[jira] [Commented] (HDFS-6515) testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache)
[ https://issues.apache.org/jira/browse/HDFS-6515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366742#comment-14366742 ] Ayappan commented on HDFS-6515: --- The above testcases are unrelated to this patch testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) - Key: HDFS-6515 URL: https://issues.apache.org/jira/browse/HDFS-6515 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0, 2.4.0, 2.4.1 Environment: Linux on PPC64 Tested with Hadoop 3.0.0 SNAPSHOT, on RHEL 6.5, on Ubuntu 14.04, on Fedora 19, using mvn -Dtest=TestFsDatasetCache#testPageRounder -X test Reporter: Tony Reix Priority: Blocker Labels: hadoop, test Attachments: HDFS-6515-1.patch, HDFS-6515-2.patch I have an issue with the test testPageRounder (org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) on Linux/PowerPC. On Linux/Intel, the test runs fine. On Linux/PowerPC, I have: testPageRounder(org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache) Time elapsed: 64.037 sec ERROR! java.lang.Exception: test timed out after 60000 milliseconds Looking at the details, I see that some "Failed to cache" messages appear in the traces: only 10 on Intel, but 186 on PPC64. On PPC64, it looks like some thread is waiting for something that never happens, generating a timeout. I'm now using the IBM JVM; however, I've just checked that the issue also appears with OpenJDK. I'm now using the latest Hadoop; however, the issue already appeared with Hadoop 2.4.0. I need help understanding what the test is doing and what traces are expected, in order to understand what/where the root cause is. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
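The page-size dependence the test name hints at can be seen with simple round-up arithmetic. This is an illustrative sketch with assumed page sizes (PPC64 kernels commonly use 64 KB pages where x86 uses 4 KB), not code from TestFsDatasetCache: the same requested length rounds to very different cacheable sizes, which changes how much cache capacity the test needs on PPC64.
{code}
public class PageRounderSketch {
  static long roundUpToPage(long length, long pageSize) {
    // round length up to the next multiple of pageSize
    return ((length + pageSize - 1) / pageSize) * pageSize;
  }

  public static void main(String[] args) {
    long length = 5000;
    System.out.println(roundUpToPage(length, 4096));  // 8192 with 4 KB pages
    System.out.println(roundUpToPage(length, 65536)); // 65536 with 64 KB pages
  }
}
{code}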
[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366745#comment-14366745 ] Fengdong Yu commented on HDFS-7285: --- [~zhz], Can you explain how to run your Python code? You don't have a parameter specification. Erasure Coding Support inside HDFS -- Key: HDFS-7285 URL: https://issues.apache.org/jira/browse/HDFS-7285 Project: Hadoop HDFS Issue Type: New Feature Reporter: Weihua Jiang Assignee: Zhe Zhang Attachments: ECAnalyzer.py, ECParser.py, HDFS-7285-initial-PoC.patch, HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, HDFSErasureCodingDesign-20150204.pdf, HDFSErasureCodingDesign-20150206.pdf, fsimage-analysis-20150105.pdf Erasure Coding (EC) can greatly reduce the storage overhead without sacrificing data reliability, compared to the existing HDFS 3-replica approach. For example, if we use a 10+4 Reed Solomon coding, we can allow loss of 4 blocks, with the storage overhead only being 40%. This makes EC a quite attractive alternative for big data storage, particularly for cold data. Facebook had a related open source project called HDFS-RAID. It used to be one of the contributed packages in HDFS but has been removed since Hadoop 2.0 for maintenance reasons. The drawbacks are: 1) it is on top of HDFS and depends on MapReduce to do encoding and decoding tasks; 2) it can only be used for cold files that are intended not to be appended anymore; 3) the pure Java EC coding implementation is extremely slow in practical use. Due to these, it might not be a good idea to just bring HDFS-RAID back. We (Intel and Cloudera) are working on a design to build EC into HDFS that gets rid of any external dependencies, makes it self-contained and independently maintained. This design lays the EC feature on top of the storage type support and is designed to be compatible with existing HDFS features like caching, snapshots, encryption, high availability, etc. This design will also support different EC coding schemes, implementations and policies for different deployment scenarios. By utilizing advanced libraries (e.g. the Intel ISA-L library), an implementation can greatly improve the performance of EC encoding/decoding and make the EC solution even more attractive. We will post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
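As a quick arithmetic check of the 10+4 figures above (our calculation, not from the thread): with k = 10 data blocks and m = 4 parity blocks per group, the storage overhead is m/k = 4/10 = 40%, and any m = 4 of the k+m = 14 blocks in a group can be lost without losing data. Three-way replication, by contrast, stores 2 extra copies of every block (200% overhead) yet tolerates only 2 lost replicas per block.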
[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366753#comment-14366753 ] Fengdong Yu commented on HDFS-7285: --- Wow, why are there so many repeated comments here? Erasure Coding Support inside HDFS -- Key: HDFS-7285 URL: https://issues.apache.org/jira/browse/HDFS-7285 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-7285: Comment: was deleted (was: [~zhz], Can you explain how to run your Python code? you don't have parameter specification. ) Erasure Coding Support inside HDFS -- Key: HDFS-7285 URL: https://issues.apache.org/jira/browse/HDFS-7285 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7369) Erasure coding: distribute recovery work for striped blocks to DataNode
[ https://issues.apache.org/jira/browse/HDFS-7369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366760#comment-14366760 ] Jing Zhao commented on HDFS-7369: - Some questions and comments:
# A question on BlockECRecoveryInfo and BlockECRecoveryCommand: for an (m,k) EC schema, I guess a DN needs to know this information for recovery: 1) block ID and block pool ID, 2) new generation stamp, 3) DNs with healthy blocks and their corresponding indices, 4) target DNs and storages, and 5) EC schema information. It looks like we currently do not have 2) and 5).
# In BlockECRecoveryCommand, can we use DatanodeStorageInfo directly instead of separating it into DatanodeInfo, StorageType, and storage ID? Also, the blocks for recovery may not all belong to the same block pool, so we may need to use ExtendedBlock here.
# Minor: for BlockECRecoveryCommand, instead of using two-dimensional arrays, it may be clearer to use a structure like {{BlockRecoveryCommand}}, i.e., to wrap the information for one striped block into a single class (see the sketch after this entry).
# Minor: need to fix the javadoc of BlockRecoveryWork (srcNodes and targets cannot be resolved). Also need to fix the javadoc for the BlockECRecoveryCommand class.
# Minor: {{BlockManager#chooseSourceDatanodes}}'s javadoc needs reformatting. Also, {{missingBlockIndices}}'s javadoc is missing.
# Nit-pick: the change of imports in DatanodeDescriptor is unnecessary.
Erasure coding: distribute recovery work for striped blocks to DataNode --- Key: HDFS-7369 URL: https://issues.apache.org/jira/browse/HDFS-7369 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-7369-000-part1.patch, HDFS-7369-000-part2.patch, HDFS-7369-001.patch, HDFS-7369-002.patch, HDFS-7369-003.patch This JIRA updates the NameNode to handle background / offline recovery of erasure-coded blocks. It includes 2 parts:
# Extend {{UnderReplicatedBlocks}} to recognize EC blocks and insert them at the appropriate priority levels.
# Update {{ReplicationMonitor}} to distinguish block codec tasks and send a new DataNode command.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
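To make suggestion #3 above concrete, here is a minimal sketch of wrapping the recovery information for one striped block into a single class, in the style of {{BlockRecoveryCommand}}. All names are illustrative, not taken from the patch; it also folds in items 1) and 2) from the first comment.
{code}
import org.apache.hadoop.hdfs.protocol.ExtendedBlock;
import org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo;

// Illustrative sketch of review comment #3: one object per striped block,
// instead of parallel two-dimensional arrays. An EC schema reference
// (item 5 in comment #1) would also belong here once that work lands.
public class StripedBlockECRecoveryInfo {
  private final ExtendedBlock block;            // block ID + block pool ID
  private final long newGenerationStamp;        // item 2) in comment #1
  private final DatanodeStorageInfo[] sources;  // DNs holding healthy internal blocks
  private final short[] sourceIndices;          // each source's index in the group
  private final DatanodeStorageInfo[] targets;  // where recovered blocks are written

  public StripedBlockECRecoveryInfo(ExtendedBlock block, long newGenerationStamp,
      DatanodeStorageInfo[] sources, short[] sourceIndices,
      DatanodeStorageInfo[] targets) {
    this.block = block;
    this.newGenerationStamp = newGenerationStamp;
    this.sources = sources;
    this.sourceIndices = sourceIndices;
    this.targets = targets;
  }

  public ExtendedBlock getBlock() { return block; }
  public long getNewGenerationStamp() { return newGenerationStamp; }
  public DatanodeStorageInfo[] getSources() { return sources; }
  public short[] getSourceIndices() { return sourceIndices; }
  public DatanodeStorageInfo[] getTargets() { return targets; }
}
{code}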
[jira] [Commented] (HDFS-7617) Add editlog transactions for EC
[ https://issues.apache.org/jira/browse/HDFS-7617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366768#comment-14366768 ] Hui Zheng commented on HDFS-7617: - [HDFS-7749|https://issues.apache.org/jira/browse/HDFS-7749] has already added the editlog functionality for striped blocks, but it does not serialize the dataBlockNum and parityBlockNum, so we need to add that. Add editlog transactions for EC --- Key: HDFS-7617 URL: https://issues.apache.org/jira/browse/HDFS-7617 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Hui Zheng -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7881) TestHftpFileSystem#testSeek fails in branch-2
[ https://issues.apache.org/jira/browse/HDFS-7881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366774#comment-14366774 ] Akira AJISAKA commented on HDFS-7881: - Thanks [~brahmareddy] for creating the patch! Two high-level comments:
* Would you create a method to get the stream length, to improve the readability of {{#openInputStream}}? I'd like to make the code as follows:
{code}
} else {
  long streamLength = getStreamLength(connection);
  fileLength = startPos + streamLength;
  // Java has a bug...
  in = ...
}
(snip)
}

private static long getStreamLength(HttpURLConnection connection) throws IOException {
  ...
}
{code}
* The code for parsing the range can throw {{NumberFormatException}} and {{ArrayIndexOutOfBoundsException}}, which are unchecked. I'd like to catch these unchecked exceptions and throw a checked {{IOException}} to signal that getting the content length by parsing the content range failed.
Other comments:
{code}
if (connection.getResponseCode() == 206) {
{code}
* 206 should be {{HttpStatus.SC_PARTIAL_CONTENT}}.
* It would be better to add a comment explaining why we parse the content range. For example:
{code}
// Try to get the content length by parsing the content range
// because HftpFileSystem does not return the content length
// if the content is partial.
{code}
{code}
String[] str = range.substring(6).split("[-/]");
return Long.parseLong(str[1]) - Long.parseLong(str[0]) + 1;
{code}
* It would be better to comment what input is expected and how it is parsed. (Hint: The input is expected to be created by {{org.mortbay.jetty.InclusiveByteRange#toHeaderRangeString}}.)
TestHftpFileSystem#testSeek fails in branch-2 - Key: HDFS-7881 URL: https://issues.apache.org/jira/browse/HDFS-7881 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Akira AJISAKA Assignee: Brahma Reddy Battula Priority: Blocker Attachments: HDFS-7881.patch TestHftpFileSystem#testSeek fails in branch-2.
{code}
--- T E S T S ---
Running org.apache.hadoop.hdfs.web.TestHftpFileSystem
Tests run: 14, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 6.201 sec FAILURE! - in org.apache.hadoop.hdfs.web.TestHftpFileSystem
testSeek(org.apache.hadoop.hdfs.web.TestHftpFileSystem) Time elapsed: 0.054 sec ERROR!
java.io.IOException: Content-Length is missing: {null=[HTTP/1.1 206 Partial Content], Date=[Wed, 04 Mar 2015 05:32:30 GMT, Wed, 04 Mar 2015 05:32:30 GMT], Expires=[Wed, 04 Mar 2015 05:32:30 GMT, Wed, 04 Mar 2015 05:32:30 GMT], Connection=[close], Content-Type=[text/plain; charset=utf-8], Server=[Jetty(6.1.26)], Content-Range=[bytes 7-9/10], Pragma=[no-cache, no-cache], Cache-Control=[no-cache]}
  at org.apache.hadoop.hdfs.web.ByteRangeInputStream.openInputStream(ByteRangeInputStream.java:132)
  at org.apache.hadoop.hdfs.web.ByteRangeInputStream.getInputStream(ByteRangeInputStream.java:104)
  at org.apache.hadoop.hdfs.web.ByteRangeInputStream.read(ByteRangeInputStream.java:181)
  at java.io.FilterInputStream.read(FilterInputStream.java:83)
  at org.apache.hadoop.hdfs.web.TestHftpFileSystem.testSeek(TestHftpFileSystem.java:253)

Results :
Tests in error: TestHftpFileSystem.testSeek:253 » IO Content-Length is missing: {null=[HTTP/1
Tests run: 14, Failures: 0, Errors: 1, Skipped: 0
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
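A sketch of what the suggested helper might look like with both high-level comments applied. This is an illustration of the review suggestions, not the actual patch; only the method name {{getStreamLength}} and the parsing expression come from the comment above.
{code}
import java.io.IOException;
import java.net.HttpURLConnection;
import org.apache.commons.httpclient.HttpStatus;

// Illustrative only: combines the two review suggestions above. Expects a
// Content-Range of the form "bytes 7-9/10", as produced by
// org.mortbay.jetty.InclusiveByteRange#toHeaderRangeString.
private static long getStreamLength(HttpURLConnection connection)
    throws IOException {
  long contentLength = connection.getContentLength();
  if (contentLength >= 0
      || connection.getResponseCode() != HttpStatus.SC_PARTIAL_CONTENT) {
    return contentLength;
  }
  // Try to get the content length by parsing the content range
  // because HftpFileSystem does not return the content length
  // if the content is partial.
  String range = connection.getHeaderField("Content-Range");
  try {
    // substring(6) skips the leading "bytes "; the range is inclusive,
    // so for "bytes 7-9/10" the length is 9 - 7 + 1 = 3.
    String[] str = range.substring(6).split("[-/]");
    return Long.parseLong(str[1]) - Long.parseLong(str[0]) + 1;
  } catch (NumberFormatException | ArrayIndexOutOfBoundsException
      | NullPointerException e) {
    // Wrap the unchecked exceptions into a checked IOException,
    // as suggested in the second high-level comment.
    throw new IOException(
        "Failed to get content length by parsing the content range: "
        + range, e);
  }
}
{code}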
[jira] [Commented] (HDFS-7617) Add editlog transactions for EC
[ https://issues.apache.org/jira/browse/HDFS-7617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366778#comment-14366778 ] Jing Zhao commented on HDFS-7617: - Hi Hui, while applying an editlog op that adds a striped block, the dataBlockNum and parityBlockNum information can be retrieved from the EC schema, which is associated with either the file itself or an ancestor directory. Thus I do not think we need to serialize them in the editlog. In the current code, since the EC schema XAttr work is still in progress, we simply use HdfsConstants.NUM_DATA_BLOCKS and HdfsConstants.NUM_PARITY_BLOCKS for block allocation. Maybe you can later use this jira to update FSEditLogLoader after the corresponding code gets committed. Add editlog transactions for EC --- Key: HDFS-7617 URL: https://issues.apache.org/jira/browse/HDFS-7617 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Hui Zheng -- This message was sent by Atlassian JIRA (v6.3.4#6332)
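Roughly, the idea is that {{FSEditLogLoader}} would look the counts up instead of reading them from the op. A sketch under stated assumptions: {{getECSchema}} and the schema accessors are placeholders for the then-unfinished schema XAttr lookup; only the two {{HdfsConstants}} fields come from the comment above.
{code}
// Hypothetical sketch: while replaying an edit op that adds a striped block,
// derive dataBlockNum/parityBlockNum from the EC schema attached to the file
// or an ancestor directory, falling back to the interim constants the branch
// currently uses. getECSchema(...) is a placeholder for the unfinished lookup.
ECSchema schema = getECSchema(fsDir, inodePath);
int numDataBlocks = (schema != null)
    ? schema.getNumDataUnits()
    : HdfsConstants.NUM_DATA_BLOCKS;
int numParityBlocks = (schema != null)
    ? schema.getNumParityUnits()
    : HdfsConstants.NUM_PARITY_BLOCKS;
{code}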
[jira] [Work started] (HDFS-7827) Erasure Coding: support striped blocks in non-protobuf fsimage
[ https://issues.apache.org/jira/browse/HDFS-7827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-7827 started by Hui Zheng. --- Erasure Coding: support striped blocks in non-protobuf fsimage -- Key: HDFS-7827 URL: https://issues.apache.org/jira/browse/HDFS-7827 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Hui Zheng Attachments: HDFS-7827.000.patch, HDFS-7827.002.patch, HDFS-7827.003.patch HDFS-7749 only adds code to persist striped blocks to protobuf-based fsimage. We should also add this support to the non-protobuf fsimage since it is still used for use cases like offline image processing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7930) commitBlockSynchronization() does not remove locations
[ https://issues.apache.org/jira/browse/HDFS-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368053#comment-14368053 ] Konstantin Shvachko commented on HDFS-7930: --- This will not fix {{testTruncateWithDataNodesRestart()}} completely, though. The location is correctly invalidated on the NN, but the NN then postpones the invalidation on the DN and waits for the next report.
{code}
2015-03-18 15:11:02,922 INFO BlockStateChange (CorruptReplicasMap.java:addToCorruptReplicasMap(76)) - BLOCK NameSystem.addToCorruptReplicasMap: blk_1073741827 added as corrupt on 127.0.0.1:46044 by localhost/127.0.0.1 because block is COMPLETE and reported genstamp 1003 does not match genstamp in block map 1004
2015-03-18 15:11:02,922 INFO BlockStateChange (BlockManager.java:invalidateBlock(1215)) - BLOCK* invalidateBlock: blk_1073741827_1003(stored=blk_1073741827_1004) on 127.0.0.1:46044
2015-03-18 15:11:02,922 INFO BlockStateChange (BlockManager.java:invalidateBlock(1225)) - BLOCK* invalidateBlocks: postponing invalidation of blk_1073741827_1003(stored=blk_1073741827_1004) on 127.0.0.1:46044 because 1 replica(s) are located on nodes with potentially out-of-date block reports
{code}
If I add {{triggerBlockReports()}} before {{waitReplication()}}, the test passes, as this finally triggers deletion of the replica on the DN. I am fine with fixing the test by adding {{triggerBlockReports()}} as above, but I don't know the reason for postponing replica deletion. Postponing should probably be avoided in this case, since {{commitBlockSync()}} is as good as a block report for the particular block. BTW, your change completely eliminates the failure of {{testTruncateWithDataNodesRestartImmediately()}} from HDFS-7886, which I ran without {{triggerBlockReports()}}.
commitBlockSynchronization() does not remove locations -- Key: HDFS-7930 URL: https://issues.apache.org/jira/browse/HDFS-7930 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Konstantin Shvachko Assignee: Yi Liu Priority: Blocker Attachments: HDFS-7930.001.patch, HDFS-7930.002.patch When {{commitBlockSynchronization()}} has fewer {{newTargets}} than the original block, it does not remove unconfirmed locations. This results in the block storing locations of different lengths or genStamps (corrupt).
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
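Concretely, the workaround described above would look roughly like this in the test ({{cluster}}, {{fs}}, {{filePath}}, and {{REPLICATION}} stand for the usual MiniDFSCluster test fixtures; this is a sketch of the suggestion, not the committed fix):
{code}
// Force all DataNodes to send block reports so the NameNode stops
// postponing invalidation of the stale replica, then wait for
// replication to settle.
cluster.triggerBlockReports();
DFSTestUtil.waitReplication(fs, filePath, REPLICATION);
{code}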
[jira] [Commented] (HDFS-7853) Erasure coding: extend LocatedBlocks to support reading from striped files
[ https://issues.apache.org/jira/browse/HDFS-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368051#comment-14368051 ] Jing Zhao commented on HDFS-7853: - Yeah, you're right. We need to either provide another method to traverse the storages so that we can get the index for free, or use sentinels as an optimization. Please feel free to create a jira for this optimization. Erasure coding: extend LocatedBlocks to support reading from striped files -- Key: HDFS-7853 URL: https://issues.apache.org/jira/browse/HDFS-7853 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Jing Zhao Fix For: HDFS-7285 Attachments: HDFS-7853.000.patch, HDFS-7853.001.patch, HDFS-7853.002.patch We should extend the {{LocatedBlocks}} class so {{getBlockLocations}} can work with the striping layout (possibly with an extra list specifying the index of each location in the group). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
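A minimal sketch of the first option, an index-aware traversal. All names here are hypothetical and not from the EC branch; only {{DatanodeStorageInfo}} is an existing class.
{code}
// Hypothetical sketch: traverse a striped block's storages together with each
// storage's position in the block group, so callers get the index "for free"
// instead of looking it up separately.
interface StorageVisitor {
  void visit(DatanodeStorageInfo storage, int indexInBlockGroup);
}

static void forEachStorage(DatanodeStorageInfo[] storages, StorageVisitor visitor) {
  // In a striped block group, position i corresponds to internal block index i.
  for (int i = 0; i < storages.length; i++) {
    if (storages[i] != null) { // a null slot could serve as the sentinel mentioned above
      visitor.visit(storages[i], i);
    }
  }
}
{code}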
[jira] [Commented] (HDFS-5523) Support multiple subdirectory exports in HDFS NFS gateway
[ https://issues.apache.org/jira/browse/HDFS-5523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368142#comment-14368142 ] Zhe Zhang commented on HDFS-5523: - I think we can disallow configuring overlapping exports? Support multiple subdirectory exports in HDFS NFS gateway -- Key: HDFS-5523 URL: https://issues.apache.org/jira/browse/HDFS-5523 Project: Hadoop HDFS Issue Type: New Feature Components: nfs Reporter: Brandon Li Currently, the HDFS NFS Gateway only supports configuring a single subdirectory export via the {{dfs.nfs3.export.point}} configuration setting. Supporting multiple subdirectory exports can make data and security management easier when using the HDFS NFS Gateway. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
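As an illustration of that idea, a validation pass over the configured export points might look like this. The gateway had no such check at the time; class and method names are illustrative only.
{code}
import java.util.List;

// Hypothetical sketch: reject configurations in which one export point is a
// path prefix of another, e.g. "/data" and "/data/archive" would overlap.
public final class ExportOverlapChecker {
  public static void checkNoOverlappingExports(List<String> exports) {
    for (int i = 0; i < exports.size(); i++) {
      for (int j = i + 1; j < exports.size(); j++) {
        String a = exports.get(i);
        String b = exports.get(j);
        if (isSameOrAncestor(a, b) || isSameOrAncestor(b, a)) {
          throw new IllegalArgumentException(
              "Overlapping NFS exports are not allowed: " + a + " and " + b);
        }
      }
    }
  }

  private static boolean isSameOrAncestor(String parent, String child) {
    return child.equals(parent)
        || child.startsWith(parent.endsWith("/") ? parent : parent + "/");
  }
}
{code}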
[jira] [Commented] (HDFS-7713) Improve the HDFS Web UI browser to allow creating dirs
[ https://issues.apache.org/jira/browse/HDFS-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368152#comment-14368152 ] Ravi Prakash commented on HDFS-7713: These test failures are most likely spurious, because I never touched Java code in this patch. Improve the HDFS Web UI browser to allow creating dirs -- Key: HDFS-7713 URL: https://issues.apache.org/jira/browse/HDFS-7713 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: HDFS-7713.01.patch, HDFS-7713.02.patch, HDFS-7713.03.patch, HDFS-7713.04.patch, HDFS-7713.05.patch, HDFS-7713.06.patch This sub-task JIRA is for improving the NN HTML5 UI to allow the user to create directories. It uses WebHDFS and adds to the great work done in HDFS-6252. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7945) The WebHdfs system on DN does not honor the length parameter
[ https://issues.apache.org/jira/browse/HDFS-7945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368210#comment-14368210 ] Hudson commented on HDFS-7945: -- FAILURE: Integrated in Hadoop-trunk-Commit #7365 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7365/]) HDFS-7945. The WebHdfs system on DN does not honor the length parameter. Contributed by Haohui Mai. (wheat9: rev 8c40e88d5de51a273f6ae5cd11c40f44248bbfcd)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/WebHdfsHandler.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/ParameterParser.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/LengthParam.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
The WebHdfs system on DN does not honor the length parameter Key: HDFS-7945 URL: https://issues.apache.org/jira/browse/HDFS-7945 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Haohui Mai Priority: Blocker Fix For: 2.7.0 Attachments: HDFS-7945.000.patch, HDFS-7945.001.patch HDFS-7279 introduces a new WebHdfs server on the DN. The new server does not honor the length parameter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7929) inotify unable fetch pre-upgrade edit log segments once upgrade starts
[ https://issues.apache.org/jira/browse/HDFS-7929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368277#comment-14368277 ] Colin Patrick McCabe commented on HDFS-7929: +1. Thanks for adding the test. inotify unable fetch pre-upgrade edit log segments once upgrade starts -- Key: HDFS-7929 URL: https://issues.apache.org/jira/browse/HDFS-7929 Project: Hadoop HDFS Issue Type: Bug Reporter: Zhe Zhang Assignee: Zhe Zhang Attachments: HDFS-7929-000.patch, HDFS-7929-001.patch, HDFS-7929-002.patch, HDFS-7929-003.patch inotify is often used to periodically poll HDFS events. However, once an HDFS upgrade has started, edit logs are moved to /previous on the NN, which is not accessible. Moreover, once the upgrade is finalized, /previous is currently lost forever. -- This message was sent by Atlassian JIRA (v6.3.4#6332)