[jira] [Assigned] (HDFS-15590) namenode fails to start when ordered snapshot deletion feature is disabled
[ https://issues.apache.org/jira/browse/HDFS-15590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee reassigned HDFS-15590: -- Assignee: Shashikant Banerjee > namenode fails to start when ordered snapshot deletion feature is disabled > -- > > Key: HDFS-15590 > URL: https://issues.apache.org/jira/browse/HDFS-15590 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: snapshots >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Fix For: 3.4.0 > > > {code:java} > 1. Enabled ordered deletion snapshot feature. > 2. Created snapshottable directory - /user/hrt_6/atrr_dir1 > 3. Created snapshots s0, s1, s2. > 4. Deleted snapshot s2 > 5. Deleted snapshots s0, s1, s2 again > 6. Disabled ordered deletion snapshot feature > 7. Restarted Namenode > Failed to start namenode. > org.apache.hadoop.hdfs.protocol.SnapshotException: Cannot delete snapshot s2 > from path /user/hrt_6/atrr_dir2: the snapshot does not exist. > at > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:237) > at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:293) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:510) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:819) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:287) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:182) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:912) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:760) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:337) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1164) 
> at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:755) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:646) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:717) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:960) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:933) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1670) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1737) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15590) namenode fails to start when ordered snapshot deletion feature is disabled
Shashikant Banerjee created HDFS-15590: -- Summary: namenode fails to start when ordered snapshot deletion feature is disabled Key: HDFS-15590 URL: https://issues.apache.org/jira/browse/HDFS-15590 Project: Hadoop HDFS Issue Type: Sub-task Components: snapshots Reporter: Shashikant Banerjee Fix For: 3.4.0 {code:java} 1. Enabled ordered deletion snapshot feature. 2. Created snapshottable directory - /user/hrt_6/atrr_dir1 3. Created snapshots s0, s1, s2. 4. Deleted snapshot s2 5. Deleted snapshots s0, s1, s2 again 6. Disabled ordered deletion snapshot feature 7. Restarted Namenode Failed to start namenode. org.apache.hadoop.hdfs.protocol.SnapshotException: Cannot delete snapshot s2 from path /user/hrt_6/atrr_dir2: the snapshot does not exist. at org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:237) at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:293) at org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:510) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:819) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:287) at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:182) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:912) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:760) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:337) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1164) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:755) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:646) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:717) at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:960) at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:933) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1670) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1737) {code}
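The failure mode reported above can be reduced to a few lines: replaying a delete edit for a snapshot that is already gone throws, while a replay that tolerates the duplicate delete would not. This is an illustrative sketch only (the `Dir` class and method names are hypothetical, not the actual DirectorySnapshottableFeature code):

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: why replaying a DeleteSnapshotOp for an already-deleted snapshot
// aborts startup, and how a tolerant replay would behave instead.
public class SnapshotReplaySketch {

    static class Dir {
        final List<String> snapshots = new ArrayList<>();

        // Strict removal, mirroring the SnapshotException in the report.
        void removeSnapshotStrict(String name) {
            if (!snapshots.remove(name)) {
                throw new IllegalStateException(
                    "Cannot delete snapshot " + name + ": the snapshot does not exist.");
            }
        }

        // Tolerant replay: a delete of an already-gone snapshot is a no-op.
        boolean removeSnapshotIfPresent(String name) {
            return snapshots.remove(name);
        }
    }

    public static void main(String[] args) {
        Dir d = new Dir();
        d.snapshots.add("s2");
        d.removeSnapshotStrict("s2");                       // first delete succeeds
        boolean removed = d.removeSnapshotIfPresent("s2");  // replayed duplicate delete
        System.out.println(removed);                        // false: tolerated, no exception
    }
}
```

The point of the sketch is only the contrast between the two removal styles; which behavior edit-log replay should actually adopt is what this sub-task decides.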
[jira] [Commented] (HDFS-15456) TestExternalStoragePolicySatisfier fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199808#comment-17199808 ] Leon Gao commented on HDFS-15456: - [~ayushtkn] Yeah, it fails in almost all of the recent PRs: [https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2305/3/testReport/org.apache.hadoop.hdfs.server.sps/TestExternalStoragePolicySatisfier/testChooseInSameDatanodeWithONESSDShouldNotChooseIfNoSpace/] The "NODE_TOO_BUSY" line in the middle is truncated, but you can see that the last block cannot be replicated, which caused the time-out; it is the same issue I see locally: {code:java} 2020-09-16 13:03:22,323 [Listener at localhost/40885] INFO hdfs.DFSTestUtil (DFSTestUtil.java:get(2552)) - DISK replica count, expected=3 and actual=2{code} A few other examples: [https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2298/3/testReport/org.apache.hadoop.hdfs.server.sps/TestExternalStoragePolicySatisfier/testChooseInSameDatanodeWithONESSDShouldNotChooseIfNoSpace/] [https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2296/3/testReport/org.apache.hadoop.hdfs.server.sps/TestExternalStoragePolicySatisfier/testChooseInSameDatanodeWithONESSDShouldNotChooseIfNoSpace/] > TestExternalStoragePolicySatisfier fails intermittently > --- > > Key: HDFS-15456 > URL: https://issues.apache.org/jira/browse/HDFS-15456 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ahmed Hussein >Assignee: Leon Gao >Priority: Major > Labels: pull-request-available, test > Time Spent: 1h > Remaining Estimate: 0h > > {{TestExternalStoragePolicySatisfier}} frequently times-out on hadoop trunk > {code:bash} > [ERROR] Tests run: 28, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 421.443 s <<< FAILURE! - in > org.apache.hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier > [ERROR] > testChooseInSameDatanodeWithONESSDShouldNotChooseIfNoSpace(org.apache.hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier) > Time elapsed: 43.983 s <<< ERROR! 
> java.util.concurrent.TimeoutException: > Timed out waiting for condition. Thread diagnostics: > Timestamp: 2020-07-07 07:51:10,267 > "IPC Server handler 4 on default port 44933" daemon prio=5 tid=1138 > timed_waiting > java.lang.Thread.State: TIMED_WAITING > at sun.misc.Unsafe.park(Native Method) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at > java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467) > at > org.apache.hadoop.ipc.CallQueueManager.take(CallQueueManager.java:307) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2918) > "ForkJoinPool-2-worker-19" daemon prio=5 tid=235 in Object.wait() > java.lang.Thread.State: WAITING (on object monitor) > at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1824) > at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1693) > at > java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157) > "refreshUsed-/home/jenkins/jenkins-slave/workspace/PreCommit-HADOOP-Build/sourcedir/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data1/current/BP-912129709-172.17.0.2-1594151429636" > daemon prio=5 tid=1217 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at java.lang.Thread.sleep(Native Method) > at > org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:205) > at java.lang.Thread.run(Thread.java:748) > "Socket Reader #1 for port 0" daemon prio=5 tid=1192 runnable > java.lang.Thread.State: RUNNABLE > at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) > at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269) > at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93) > at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) > at 
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) > at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101) > at > org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:1273) > at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:1252) > "pool-90-thread-1" prio=5 tid=1069 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at sun.misc.Unsafe.park(Native Method) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at >
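The TimeoutException in the report comes from a poll-until-true helper giving up while the replica count is still below the expected value, at which point thread diagnostics are dumped. A minimal sketch of that pattern (the `WaitForSketch` helper and its names are illustrative, not the actual Hadoop test utilities):

```java
// Toy version of a poll-until-true wait of the kind the failing test uses.
public class WaitForSketch {

    interface Check { boolean ok(); }

    static void waitFor(Check check, long intervalMs, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!check.ok()) {
            if (System.currentTimeMillis() > deadline) {
                // In the real test this is where the TimeoutException is
                // raised and the thread diagnostics above get printed.
                throw new IllegalStateException("Timed out waiting for condition.");
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // The condition becomes true after ~50 ms, well inside the timeout,
        // so this call returns normally instead of timing out.
        waitFor(() -> System.currentTimeMillis() - start > 50, 10, 5000);
        System.out.println("condition met");
    }
}
```

In the flaky run, the condition (DISK replica count reaching 3) simply never becomes true before the deadline, so the wait itself is not at fault.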
[jira] [Commented] (HDFS-15456) TestExternalStoragePolicySatisfier fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199790#comment-17199790 ] Ayush Saxena commented on HDFS-15456: - I tried this out but couldn't reproduce it locally. Do you see this trace in any of the builds for any pre-commit job where the test failed? > TestExternalStoragePolicySatisfier fails intermittently > --- > > Key: HDFS-15456 > URL: https://issues.apache.org/jira/browse/HDFS-15456 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ahmed Hussein >Assignee: Leon Gao >Priority: Major > Labels: pull-request-available, test > Time Spent: 1h > Remaining Estimate: 0h > > {{TestExternalStoragePolicySatisfier}} frequently times-out on hadoop trunk > {code:bash} > [ERROR] Tests run: 28, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 421.443 s <<< FAILURE! - in > org.apache.hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier > [ERROR] > testChooseInSameDatanodeWithONESSDShouldNotChooseIfNoSpace(org.apache.hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier) > Time elapsed: 43.983 s <<< ERROR! > java.util.concurrent.TimeoutException: > Timed out waiting for condition. 
Thread diagnostics: > Timestamp: 2020-07-07 07:51:10,267 > "IPC Server handler 4 on default port 44933" daemon prio=5 tid=1138 > timed_waiting > java.lang.Thread.State: TIMED_WAITING > at sun.misc.Unsafe.park(Native Method) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at > java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467) > at > org.apache.hadoop.ipc.CallQueueManager.take(CallQueueManager.java:307) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2918) > "ForkJoinPool-2-worker-19" daemon prio=5 tid=235 in Object.wait() > java.lang.Thread.State: WAITING (on object monitor) > at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1824) > at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1693) > at > java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157) > "refreshUsed-/home/jenkins/jenkins-slave/workspace/PreCommit-HADOOP-Build/sourcedir/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data1/current/BP-912129709-172.17.0.2-1594151429636" > daemon prio=5 tid=1217 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at java.lang.Thread.sleep(Native Method) > at > org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:205) > at java.lang.Thread.run(Thread.java:748) > "Socket Reader #1 for port 0" daemon prio=5 tid=1192 runnable > java.lang.Thread.State: RUNNABLE > at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) > at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269) > at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93) > at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) > at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) > at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101) > at > 
org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:1273) > at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:1252) > "pool-90-thread-1" prio=5 tid=1069 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at sun.misc.Unsafe.park(Native Method) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > "IPC Server handler 2 on default port 37995" daemon prio=5 tid=1169 > timed_waiting > java.lang.Thread.State: TIMED_WAITING > at sun.misc.Unsafe.park(Native Method) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at >
[jira] [Work logged] (HDFS-15554) RBF: force router check file existence in destinations before adding/updating mount points
[ https://issues.apache.org/jira/browse/HDFS-15554?focusedWorklogId=487847=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487847 ] ASF GitHub Bot logged work on HDFS-15554: - Author: ASF GitHub Bot Created on: 22/Sep/20 03:39 Start Date: 22/Sep/20 03:39 Worklog Time Spent: 10m Work Description: ayushtkn merged pull request #2266: URL: https://github.com/apache/hadoop/pull/2266 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487847) Time Spent: 3h 50m (was: 3h 40m) > RBF: force router check file existence in destinations before adding/updating > mount points > -- > > Key: HDFS-15554 > URL: https://issues.apache.org/jira/browse/HDFS-15554 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Fengnan Li >Assignee: Fengnan Li >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 3h 50m > Remaining Estimate: 0h > > Adding/Updating mount points right now is only a router action, without > validation in the downstream namenodes for the destination files/directories. > In practice we have set up dangling mount points, and when clients call > listStatus they get the file returned, but then if they try to access > the file, a FileNotFoundException is thrown.
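The check this improvement forces can be sketched as: before accepting a mount entry, ask whether the destination exists downstream and reject it otherwise. This is a toy model (the `MountTableSketch` class and its fields are hypothetical; the real change lives in the Router admin code path):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of destination validation for mount points.
public class MountTableSketch {

    final Set<String> existingPaths = new HashSet<>(); // stands in for namenode state
    final Map<String, String> mounts = new HashMap<>();

    boolean addMount(String src, String dest) {
        if (!existingPaths.contains(dest)) {
            return false; // dangling destination: refuse to add the mount point
        }
        mounts.put(src, dest);
        return true;
    }

    public static void main(String[] args) {
        MountTableSketch table = new MountTableSketch();
        table.existingPaths.add("/real/dir");
        System.out.println(table.addMount("/mnt/a", "/real/dir")); // true
        System.out.println(table.addMount("/mnt/b", "/missing"));  // false
    }
}
```

Rejecting the entry up front is what prevents the listStatus-succeeds-but-open-fails inconsistency the description reports.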
[jira] [Commented] (HDFS-15589) Huge PostponedMisreplicatedBlocks can't decrease immediately when start namenode after datanode
[ https://issues.apache.org/jira/browse/HDFS-15589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199786#comment-17199786 ] Ayush Saxena commented on HDFS-15589: - Yes, that is true. For exactly this there was an earlier proposal that block reports could be triggered after failover, but it couldn't reach a conclusion, since the number of datanodes in actual production will be quite high, and it could increase load on the Namenode. If you are facing this trouble, you can trigger a block report explicitly using {{dfsadmin}}, or do you propose any solution to this? > Huge PostponedMisreplicatedBlocks can't decrease immediately when start > namenode after datanode > --- > > Key: HDFS-15589 > URL: https://issues.apache.org/jira/browse/HDFS-15589 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs > Environment: CentOS 7 >Reporter: zhengchenyu >Priority: Major > > In our test cluster, I restarted my namenode. Then I found many > PostponedMisreplicatedBlocks which didn't decrease immediately. > I searched the log and found entries like the following. 
> {code:java} > 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: > from DatanodeRegistration(xx.xx.xx.xx:9866, > datanodeUuid=c6a9934f-afd4-4437-b976-fed55173ce57, infoPort=9864, > infoSecurePort=0, ipcPort=9867, > storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), > reports.length=12 > 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: > from DatanodeRegistration(xx.xx.xx.xx:9866, > datanodeUuid=aee144f1-2082-4bca-a92b-f3c154a71c65, infoPort=9864, > infoSecurePort=0, ipcPort=9867, > storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), > reports.length=12 > 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: > from DatanodeRegistration(xx.xx.xx.xx:9866, > datanodeUuid=d152fa5b-1089-4bfc-b9c4-e3a7d98c7a7b, infoPort=9864, > infoSecurePort=0, ipcPort=9867, > storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), > reports.length=12 > 2020-09-21 17:02:37,156 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: > from DatanodeRegistration(xx.xx.xx.xx:9866, > datanodeUuid=5cffc1fe-ace9-4af8-adfc-6002a7f5565d, infoPort=9864, > infoSecurePort=0, ipcPort=9867, > storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), > reports.length=12 > 2020-09-21 17:02:37,161 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: > from DatanodeRegistration(xx.xx.xx.xx:9866, > datanodeUuid=9980d8e1-b0d9-4657-b97d-c803f82c1459, infoPort=9864, > infoSecurePort=0, ipcPort=9867, > storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), > reports.length=12 > 2020-09-21 17:02:37,197 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: > from DatanodeRegistration(xx.xx.xx.xx:9866, > datanodeUuid=77ff3f5e-37f0-405f-a16c-166311546cae, infoPort=9864, > infoSecurePort=0, ipcPort=9867, > 
storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), > reports.length=12 > {code} > Note: the test cluster only has 6 datanodes. > You will see the blockreport is called before "Marking all datanodes as stale", > which is logged by startActiveServices. But > DatanodeStorageInfo.blockContentsStale is only set to false in a blockreport, and then > startActiveServices sets all datanodes to stale. So the datanodes will > stay stale until the next blockreport, and PostponedMisreplicatedBlocks keeps a > huge number.
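The ordering problem described above can be reduced to a toy model: the stale flag cleared by a block report is immediately re-set by the failover path, so it stays set until the next report arrives. Illustrative only (the `Storage` class here is hypothetical, not the actual DatanodeStorageInfo):

```java
// Toy model of the block-report / startActiveServices ordering issue.
public class StaleOrderingSketch {

    static class Storage {
        boolean blockContentsStale = true;
    }

    public static void main(String[] args) {
        Storage s = new Storage();
        s.blockContentsStale = false; // block report processed first
        s.blockContentsStale = true;  // startActiveServices then marks all stale
        // Until the next block report, postponed misreplicated blocks cannot
        // be re-scanned, so the counter stays high.
        System.out.println(s.blockContentsStale); // true
    }
}
```

The model only shows why order matters: if the report had arrived after the stale-marking instead of before, the flag would end up cleared.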
[jira] [Work logged] (HDFS-15557) Log the reason why a storage log file can't be deleted
[ https://issues.apache.org/jira/browse/HDFS-15557?focusedWorklogId=487737=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487737 ] ASF GitHub Bot logged work on HDFS-15557: - Author: ASF GitHub Bot Created on: 22/Sep/20 03:31 Start Date: 22/Sep/20 03:31 Worklog Time Spent: 10m Work Description: NickyYe commented on a change in pull request #2274: URL: https://github.com/apache/hadoop/pull/2274#discussion_r491797152
## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/AtomicFileOutputStream.java ##
@@ -75,8 +76,13 @@ public void close() throws IOException {
       boolean renamed = tmpFile.renameTo(origFile);
       if (!renamed) {
         // On windows, renameTo does not replace.
-        if (origFile.exists() && !origFile.delete()) {
-          throw new IOException("Could not delete original file " + origFile);
+        if (origFile.exists()) {
+          try {
+            Files.delete(origFile.toPath());
+          } catch (IOException e) {
+            throw new IOException("Could not delete original file " + origFile
Review comment: Fixed. Thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487737) Time Spent: 1h 50m (was: 1h 40m) > Log the reason why a storage log file can't be deleted > -- > > Key: HDFS-15557 > URL: https://issues.apache.org/jira/browse/HDFS-15557 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ye Ni >Assignee: Ye Ni >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > Before > > {code:java} > 2020-09-02 06:48:31,983 WARN [IPC Server handler 206 on 8020] > org.apache.hadoop.hdfs.server.common.Storage: writeTransactionIdToStorage > failed on Storage Directory root= K:\data\hdfs\namenode; location= null; > type= IMAGE; isShared= false; lock= > sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid]; storageUuid= > null java.io.IOException: Could not delete original file > K:\data\hdfs\namenode\current\seen_txid{code} > > After > > {code:java} > 2020-09-02 17:43:29,421 WARN [IPC Server handler 111 on 8020] > org.apache.hadoop.hdfs.server.common.Storage: writeTransactionIdToStorage > failed on Storage Directory root= K:\data\hdfs\namenode; location= null; > type= IMAGE; isShared= false; lock= > sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid]; storageUuid= > null java.io.IOException: Could not delete original file > K:\data\hdfs\namenode\current\seen_txid due to failure: > java.nio.file.FileSystemException: K:\data\hdfs\namenode\current\seen_txid: > The process cannot access the file because it is being used by another > process.{code}
[jira] [Work logged] (HDFS-15557) Log the reason why a storage log file can't be deleted
[ https://issues.apache.org/jira/browse/HDFS-15557?focusedWorklogId=487626=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487626 ] ASF GitHub Bot logged work on HDFS-15557: - Author: ASF GitHub Bot Created on: 22/Sep/20 03:22 Start Date: 22/Sep/20 03:22 Worklog Time Spent: 10m Work Description: goiri commented on pull request #2274: URL: https://github.com/apache/hadoop/pull/2274#issuecomment-696264939 Not sure why the build came out so badly... let's see if we can retrigger. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487626) Time Spent: 1h 40m (was: 1.5h) > Log the reason why a storage log file can't be deleted > -- > > Key: HDFS-15557 > URL: https://issues.apache.org/jira/browse/HDFS-15557 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ye Ni >Assignee: Ye Ni >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > Before > > {code:java} > 2020-09-02 06:48:31,983 WARN [IPC Server handler 206 on 8020] > org.apache.hadoop.hdfs.server.common.Storage: writeTransactionIdToStorage > failed on Storage Directory root= K:\data\hdfs\namenode; location= null; > type= IMAGE; isShared= false; lock= > sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid]; storageUuid= > null java.io.IOException: Could not delete original file > K:\data\hdfs\namenode\current\seen_txid{code} > > After > > {code:java} > 2020-09-02 17:43:29,421 WARN [IPC Server handler 111 on 8020] > org.apache.hadoop.hdfs.server.common.Storage: writeTransactionIdToStorage > failed on Storage Directory root= K:\data\hdfs\namenode; location= null; > type= IMAGE; isShared= false; lock= > sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid]; storageUuid= > null 
java.io.IOException: Could not delete original file > K:\data\hdfs\namenode\current\seen_txid due to failure: > java.nio.file.FileSystemException: K:\data\hdfs\namenode\current\seen_txid: > The process cannot access the file because it is being used by another > process.{code}
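The difference the patch relies on can be seen directly with the standard JDK APIs: `java.io.File.delete()` reports failure as a bare boolean, while `java.nio.file.Files.delete()` throws an `IOException` subtype whose message carries the reason (as in the "After" log above). A small self-contained demo; the file path is arbitrary and assumed not to exist:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Contrasts the two deletion APIs' failure reporting.
public class DeleteReasonDemo {
    public static void main(String[] args) {
        File missing = new File("definitely-missing-file-for-demo");
        // Old style: just "false", nothing to log about why it failed.
        System.out.println("File.delete() -> " + missing.delete());
        // New style: throws (here NoSuchFileException) naming the path.
        try {
            Files.delete(missing.toPath());
        } catch (IOException e) {
            System.out.println("Files.delete() -> " + e.getClass().getSimpleName());
        }
    }
}
```

On Windows, a file held open by another process similarly surfaces as a `FileSystemException` with the "being used by another process" reason, which is exactly the extra detail the improved log message captures.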
[jira] [Work logged] (HDFS-15548) Allow configuring DISK/ARCHIVE storage types on same device mount
[ https://issues.apache.org/jira/browse/HDFS-15548?focusedWorklogId=487831=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487831 ] ASF GitHub Bot logged work on HDFS-15548: - Author: ASF GitHub Bot Created on: 22/Sep/20 03:38 Start Date: 22/Sep/20 03:38 Worklog Time Spent: 10m Work Description: LeonGao91 commented on pull request #2288: URL: https://github.com/apache/hadoop/pull/2288#issuecomment-696393867 @Hexiaoqiao Would you please take a second look? I have added a check as we discussed with UT. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487831) Time Spent: 3h 40m (was: 3.5h) > Allow configuring DISK/ARCHIVE storage types on same device mount > - > > Key: HDFS-15548 > URL: https://issues.apache.org/jira/browse/HDFS-15548 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Leon Gao >Assignee: Leon Gao >Priority: Major > Labels: pull-request-available > Time Spent: 3h 40m > Remaining Estimate: 0h > > We can allow configuring DISK/ARCHIVE storage types on the same device mount > on two separate directories. > Users should be able to configure the capacity for each. Also, the datanode > usage report should report stats correctly.
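One way to think about configuring capacity for two storage types sharing one mount is a simple ratio split of the device capacity. The sketch below is a hypothetical scheme for illustration only, not the actual HDFS configuration keys or the datanode's accounting logic:

```java
// Hypothetical ratio-based capacity split for DISK and ARCHIVE directories
// that live on the same device mount.
public class SharedMountSketch {
    public static void main(String[] args) {
        long mountCapacity = 10_000_000_000L;  // bytes on the shared device
        double diskRatio = 0.3;                // fraction reserved for DISK
        long diskCapacity = (long) (mountCapacity * diskRatio);
        long archiveCapacity = mountCapacity - diskCapacity;
        // For the usage report to be correct, the two reported capacities
        // must add back up to the real device capacity.
        System.out.println(diskCapacity + archiveCapacity == mountCapacity); // true
    }
}
```

The invariant in the comment (the per-type capacities summing to the device capacity) is the property the "usage report should report stats correctly" requirement amounts to.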
[jira] [Work logged] (HDFS-15025) Applying NVDIMM storage media to HDFS
[ https://issues.apache.org/jira/browse/HDFS-15025?focusedWorklogId=487562=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487562 ] ASF GitHub Bot logged work on HDFS-15025: - Author: ASF GitHub Bot Created on: 22/Sep/20 03:16 Start Date: 22/Sep/20 03:16 Worklog Time Spent: 10m Work Description: liuml07 commented on a change in pull request #2189: URL: https://github.com/apache/hadoop/pull/2189#discussion_r491857797 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockStatsMXBean.java ## @@ -145,9 +150,11 @@ public void testStorageTypeStatsJMX() throws Exception { Map storageTypeStats = (Map)entry.get("value"); typesPresent.add(storageType); if (storageType.equals("ARCHIVE") || storageType.equals("DISK") ) { -assertEquals(3l, storageTypeStats.get("nodesInService")); +assertEquals(3L, storageTypeStats.get("nodesInService")); Review comment: I have not used Java 7 for a while, but I remember vaguely this is actually supported? https://docs.oracle.com/javase/specs/jls/se7/html/jls-14.html ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockStatsMXBean.java ## @@ -145,9 +150,11 @@ public void testStorageTypeStatsJMX() throws Exception { Map storageTypeStats = (Map)entry.get("value"); typesPresent.add(storageType); if (storageType.equals("ARCHIVE") || storageType.equals("DISK") ) { -assertEquals(3l, storageTypeStats.get("nodesInService")); +assertEquals(3L, storageTypeStats.get("nodesInService")); Review comment: Hadoop releases before 2.10 are all end of life (EoL). Hadoop 2.10 is the only version using Java 7. We do not need any support, compile or runtime, for Java versions before Java 7. Hadoop 3.x are all using Java 8+. We do not need any Java 7 support in Hadoop 3. This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487562) Time Spent: 7h 20m (was: 7h 10m) > Applying NVDIMM storage media to HDFS > - > > Key: HDFS-15025 > URL: https://issues.apache.org/jira/browse/HDFS-15025 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, hdfs >Reporter: YaYun Wang >Assignee: YaYun Wang >Priority: Major > Labels: pull-request-available > Attachments: Applying NVDIMM to HDFS.pdf, HDFS-15025.001.patch, > HDFS-15025.002.patch, HDFS-15025.003.patch, HDFS-15025.004.patch, > HDFS-15025.005.patch, HDFS-15025.006.patch, NVDIMM_patch(WIP).patch > > Time Spent: 7h 20m > Remaining Estimate: 0h > > The non-volatile memory NVDIMM is faster than SSD, and it can be used > simultaneously with RAM, DISK and SSD. Storing HDFS data directly on > NVDIMM not only improves the response rate of HDFS, but also ensures the > reliability of the data.
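On the `3l` vs `3L` point in the review above: both are valid long literals in every Java version, so the change is purely for readability, since a lowercase 'l' is easily misread as the digit '1':

```java
// Both suffixes produce the same long value; only legibility differs.
public class LongLiteralDemo {
    public static void main(String[] args) {
        long a = 3l; // legal, but easy to misread as the int 31
        long b = 3L; // same value, unambiguous
        System.out.println(a == b); // true
    }
}
```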
[jira] [Work logged] (HDFS-15516) Add info for create flags in NameNode audit logs
[ https://issues.apache.org/jira/browse/HDFS-15516?focusedWorklogId=487644=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487644 ] ASF GitHub Bot logged work on HDFS-15516: - Author: ASF GitHub Bot Created on: 22/Sep/20 03:23 Start Date: 22/Sep/20 03:23 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2281: URL: https://github.com/apache/hadoop/pull/2281#issuecomment-695807876 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 1m 12s | Docker mode activated. | ||| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. | | +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. | ||| _ trunk Compile Tests _ | | +0 :ok: | mvndep | 3m 14s | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 28m 13s | trunk passed | | +1 :green_heart: | compile | 20m 59s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | compile | 17m 38s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | checkstyle | 2m 54s | trunk passed | | +1 :green_heart: | mvnsite | 1m 59s | trunk passed | | +1 :green_heart: | shadedclient | 20m 57s | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 1m 27s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 1m 58s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +0 :ok: | spotbugs | 0m 43s | Used deprecated FindBugs config; considering switching to SpotBugs. 
| | +1 :green_heart: | findbugs | 3m 57s | trunk passed | ||| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 24s | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 1m 32s | the patch passed | | +1 :green_heart: | compile | 20m 13s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javac | 20m 13s | the patch passed | | +1 :green_heart: | compile | 17m 42s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | javac | 17m 42s | the patch passed | | +1 :green_heart: | checkstyle | 2m 53s | the patch passed | | +1 :green_heart: | mvnsite | 1m 55s | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. | | +1 :green_heart: | shadedclient | 15m 47s | patch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 1m 27s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 1m 59s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | findbugs | 4m 13s | the patch passed | ||| _ Other Tests _ | | -1 :x: | unit | 109m 0s | hadoop-hdfs in the patch passed. | | +1 :green_heart: | unit | 1m 20s | hadoop-dynamometer-workload in the patch passed. | | +1 :green_heart: | asflicense | 0m 56s | The patch does not generate ASF License warnings. 
| | | | 284m 36s | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.TestFileChecksum | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped | | | hadoop.hdfs.TestSnapshotCommands | | | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics | | | hadoop.hdfs.TestFileChecksumCompositeCrc | | | hadoop.hdfs.server.namenode.ha.TestHAAppend | | | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2281/6/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2281 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 5825ccc227ce 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 95dfc875d32 | | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | Multi-JDK versions |
[jira] [Work logged] (HDFS-15554) RBF: force router check file existence in destinations before adding/updating mount points
[ https://issues.apache.org/jira/browse/HDFS-15554?focusedWorklogId=487919=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487919 ] ASF GitHub Bot logged work on HDFS-15554: - Author: ASF GitHub Bot Created on: 22/Sep/20 03:46 Start Date: 22/Sep/20 03:46 Worklog Time Spent: 10m Work Description: fengnanli commented on pull request #2266: URL: https://github.com/apache/hadoop/pull/2266#issuecomment-696267644 @ayushtkn Can you help commit the change? Thanks a lot! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487919) Time Spent: 4h (was: 3h 50m) > RBF: force router check file existence in destinations before adding/updating > mount points > -- > > Key: HDFS-15554 > URL: https://issues.apache.org/jira/browse/HDFS-15554 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Fengnan Li >Assignee: Fengnan Li >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 4h > Remaining Estimate: 0h > > Adding/Updating mount points right now is only a router action without > validation in the downstream namenodes for the destination files/directories. > In practice we have set up the dangling mount points and when clients call > listStatus they would get the file returned, but then if they try to access > the file FileNotFoundException would be thrown out. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
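The validation HDFS-15554 proposes can be sketched as follows. This is a hypothetical illustration only: `MountValidationSketch`, its `Namenode` interface, and `addMountPoint` are invented stand-ins, not the actual Hadoop RBF router API.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: reject "dangling" mount points by checking that the
// destination path exists in the downstream namespace before adding the
// mount entry. All names here are invented for illustration.
class MountValidationSketch {
    // Stand-in for an existence check against a downstream namenode.
    interface Namenode {
        boolean exists(String path);
    }

    // Returns true and records the mount only if the destination exists.
    // Without this check, listStatus would show the mount entry but any
    // access to it would fail with FileNotFoundException, as described above.
    static boolean addMountPoint(Map<String, String> mountTable,
                                 String mountPath, String destPath,
                                 Namenode downstream) {
        if (!downstream.exists(destPath)) {
            return false; // refuse to create a dangling mount point
        }
        mountTable.put(mountPath, destPath);
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();
        Namenode fake = path -> path.equals("/user/data"); // pretend only this path exists
        System.out.println(addMountPoint(table, "/mnt/data", "/user/data", fake)); // true
        System.out.println(addMountPoint(table, "/mnt/bad", "/user/gone", fake));  // false
    }
}
```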
[jira] [Commented] (HDFS-15589) Huge PostponedMisreplicatedBlocks can't decrease immediately when start namenode after datanode
[ https://issues.apache.org/jira/browse/HDFS-15589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199750#comment-17199750 ] zhengchenyu commented on HDFS-15589: [~ayushtkn] I understand the postponed-block logic. I encountered a case, perhaps a low-probability one. To describe the logic simply: (1) When the namenode transitions from standby to active, it marks every DatanodeDescriptor as stale, to avoid deleting blocks that may already have been deleted. (2) Then the datanodes send block reports to the namenode, which clears the stale flag on each DatanodeDescriptor; after that, over-replicated blocks can be deleted. But if (2) happens before (1), the DatanodeDescriptor stays stale until the next block report, and block reports are a low-frequency RPC operation. So PostponedMisreplicatedBlocks stays huge for a long time. > Huge PostponedMisreplicatedBlocks can't decrease immediately when start > namenode after datanode > --- > > Key: HDFS-15589 > URL: https://issues.apache.org/jira/browse/HDFS-15589 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs > Environment: CentOS 7 >Reporter: zhengchenyu >Priority: Major > > In our test cluster, I restarted my namenode and found many > PostponedMisreplicatedBlocks that did not decrease immediately. > I found log lines like the ones below. 
> {code:java} > 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: > from DatanodeRegistration(xx.xx.xx.xx:9866, > datanodeUuid=c6a9934f-afd4-4437-b976-fed55173ce57, infoPort=9864, > infoSecurePort=0, ipcPort=9867, > storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), > reports.length=12 > 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: > from DatanodeRegistration(xx.xx.xx.xx:9866, > datanodeUuid=aee144f1-2082-4bca-a92b-f3c154a71c65, infoPort=9864, > infoSecurePort=0, ipcPort=9867, > storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), > reports.length=12 > 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: > from DatanodeRegistration(xx.xx.xx.xx:9866, > datanodeUuid=d152fa5b-1089-4bfc-b9c4-e3a7d98c7a7b, infoPort=9864, > infoSecurePort=0, ipcPort=9867, > storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), > reports.length=12 > 2020-09-21 17:02:37,156 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: > from DatanodeRegistration(xx.xx.xx.xx:9866, > datanodeUuid=5cffc1fe-ace9-4af8-adfc-6002a7f5565d, infoPort=9864, > infoSecurePort=0, ipcPort=9867, > storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), > reports.length=12 > 2020-09-21 17:02:37,161 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: > from DatanodeRegistration(xx.xx.xx.xx:9866, > datanodeUuid=9980d8e1-b0d9-4657-b97d-c803f82c1459, infoPort=9864, > infoSecurePort=0, ipcPort=9867, > storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), > reports.length=12 > 2020-09-21 17:02:37,197 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: > from DatanodeRegistration(xx.xx.xx.xx:9866, > datanodeUuid=77ff3f5e-37f0-405f-a16c-166311546cae, infoPort=9864, > infoSecurePort=0, ipcPort=9867, > 
storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), > reports.length=12 > {code} > Note: the test cluster has only 6 datanodes. > You can see the block reports arrive before "Marking all datanodes as stale", > which is logged by startActiveServices. But > DatanodeStorageInfo.blockContentsStale is only set to false during a block report, and > startActiveServices then marks all datanodes as stale again. So the datanodes > stay stale until the next block report, and PostponedMisreplicatedBlocks keeps a > huge number.
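The ordering race described in this report can be sketched with simplified stand-ins. These classes and fields are illustrative only (a single boolean mimicking `blockContentsStale`), not Hadoop's actual DatanodeDescriptor/BlockManager code.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of the race: step (1) = failover marks all datanodes stale,
// step (2) = a block report clears one datanode's stale flag. If (2) runs
// before (1), the datanode stays stale until the next (infrequent) report.
class StaleRaceSketch {
    static class DatanodeInfo {
        boolean contentStale = true; // stand-in for blockContentsStale
    }

    static final List<DatanodeInfo> datanodes = new ArrayList<>();

    // Step (2): a block report clears the stale flag for one datanode.
    static void onBlockReport(DatanodeInfo dn) {
        dn.contentStale = false;
    }

    // Step (1): startActiveServices marks every datanode stale.
    static void startActiveServices() {
        for (DatanodeInfo dn : datanodes) {
            dn.contentStale = true;
        }
    }

    public static void main(String[] args) {
        DatanodeInfo dn = new DatanodeInfo();
        datanodes.add(dn);

        // Expected order: (1) then (2) -> datanode ends up not stale.
        startActiveServices();
        onBlockReport(dn);
        System.out.println("expected order, stale = " + dn.contentStale); // false

        // Raced order: (2) before (1) -> datanode stays stale, so
        // PostponedMisreplicatedBlocks linger until the next report.
        onBlockReport(dn);
        startActiveServices();
        System.out.println("raced order, stale = " + dn.contentStale); // true
    }
}
```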
[jira] [Work logged] (HDFS-15557) Log the reason why a storage log file can't be deleted
[ https://issues.apache.org/jira/browse/HDFS-15557?focusedWorklogId=487467=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487467 ] ASF GitHub Bot logged work on HDFS-15557: - Author: ASF GitHub Bot Created on: 22/Sep/20 03:04 Start Date: 22/Sep/20 03:04 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2274: URL: https://github.com/apache/hadoop/pull/2274#issuecomment-695921398 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 31m 11s | Docker mode activated. | ||| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. | | +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | | -1 :x: | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | ||| _ trunk Compile Tests _ | | -1 :x: | mvninstall | 16m 16s | root in trunk failed. | | -1 :x: | compile | 0m 17s | hadoop-hdfs in trunk failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. | | -1 :x: | compile | 0m 26s | hadoop-hdfs in trunk failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. | | +1 :green_heart: | checkstyle | 0m 51s | trunk passed | | -1 :x: | mvnsite | 0m 25s | hadoop-hdfs in trunk failed. | | +1 :green_heart: | shadedclient | 1m 49s | branch has no errors when building and testing our client artifacts. | | -1 :x: | javadoc | 0m 26s | hadoop-hdfs in trunk failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. | | -1 :x: | javadoc | 0m 26s | hadoop-hdfs in trunk failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. | | +0 :ok: | spotbugs | 3m 8s | Used deprecated FindBugs config; considering switching to SpotBugs. | | -1 :x: | findbugs | 0m 25s | hadoop-hdfs in trunk failed. 
| ||| _ Patch Compile Tests _ | | -1 :x: | mvninstall | 0m 22s | hadoop-hdfs in the patch failed. | | -1 :x: | compile | 0m 22s | hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. | | -1 :x: | javac | 0m 22s | hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. | | -1 :x: | compile | 0m 21s | hadoop-hdfs in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. | | -1 :x: | javac | 0m 21s | hadoop-hdfs in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. | | -0 :warning: | checkstyle | 0m 20s | The patch fails to run checkstyle in hadoop-hdfs | | -1 :x: | mvnsite | 0m 23s | hadoop-hdfs in the patch failed. | | +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. | | +1 :green_heart: | shadedclient | 0m 21s | patch has no errors when building and testing our client artifacts. | | -1 :x: | javadoc | 0m 23s | hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. | | -1 :x: | javadoc | 0m 23s | hadoop-hdfs in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. | | -1 :x: | findbugs | 0m 22s | hadoop-hdfs in the patch failed. | ||| _ Other Tests _ | | -1 :x: | unit | 0m 21s | hadoop-hdfs in the patch failed. | | +0 :ok: | asflicense | 0m 22s | ASF License check generated no output? 
| | | | 59m 1s | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2274/4/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2274 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 62e70170050b 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 7a6265ac425 | | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | mvninstall | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2274/4/artifact/out/branch-mvninstall-root.txt | | compile |
[jira] [Work logged] (HDFS-15025) Applying NVDIMM storage media to HDFS
[ https://issues.apache.org/jira/browse/HDFS-15025?focusedWorklogId=487465=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487465 ] ASF GitHub Bot logged work on HDFS-15025: - Author: ASF GitHub Bot Created on: 22/Sep/20 03:04 Start Date: 22/Sep/20 03:04 Worklog Time Spent: 10m Work Description: YaYun-Wang commented on a change in pull request #2189: URL: https://github.com/apache/hadoop/pull/2189#discussion_r491852471 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockStatsMXBean.java ## @@ -145,9 +150,11 @@ public void testStorageTypeStatsJMX() throws Exception { Map storageTypeStats = (Map)entry.get("value"); typesPresent.add(storageType); if (storageType.equals("ARCHIVE") || storageType.equals("DISK") ) { -assertEquals(3l, storageTypeStats.get("nodesInService")); +assertEquals(3L, storageTypeStats.get("nodesInService")); Review comment: `storageType` is a parameter of type "java.lang.String", and `switch()` did not support "java.lang.String" before Java 1.7. So, would `if-else` be more appropriate here? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487465) Time Spent: 7h 10m (was: 7h) > Applying NVDIMM storage media to HDFS > - > > Key: HDFS-15025 > URL: https://issues.apache.org/jira/browse/HDFS-15025 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, hdfs >Reporter: YaYun Wang >Assignee: YaYun Wang >Priority: Major > Labels: pull-request-available > Attachments: Applying NVDIMM to HDFS.pdf, HDFS-15025.001.patch, > HDFS-15025.002.patch, HDFS-15025.003.patch, HDFS-15025.004.patch, > HDFS-15025.005.patch, HDFS-15025.006.patch, NVDIMM_patch(WIP).patch > > Time Spent: 7h 10m > Remaining Estimate: 0h > > The non-volatile memory NVDIMM is faster than SSD and can be used > simultaneously with RAM, DISK, and SSD. Storing HDFS data directly on > NVDIMM not only improves the response rate of HDFS but also ensures the > reliability of the data.
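On the review question above: since Java 7, `switch` does work on `String`, so the check could be written either way. A small illustration of the two equivalent forms follows; `StorageTypeCheckSketch` and its methods are simplified stand-ins for the test's storage-type check, not the actual TestBlockStatsMXBean code.

```java
// Illustrative only: the if-else form (as in the patch under review) and
// the equivalent switch-on-String form, which is legal since Java 7.
class StorageTypeCheckSketch {
    // if-else form, mirroring the review-thread snippet.
    static long expectedNodesIfElse(String storageType) {
        if (storageType.equals("ARCHIVE") || storageType.equals("DISK")) {
            return 3L;
        }
        return 0L;
    }

    // switch form: fall-through groups ARCHIVE and DISK together.
    static long expectedNodesSwitch(String storageType) {
        switch (storageType) {
            case "ARCHIVE":
            case "DISK":
                return 3L;
            default:
                return 0L;
        }
    }

    public static void main(String[] args) {
        System.out.println(expectedNodesIfElse("DISK"));    // 3
        System.out.println(expectedNodesSwitch("ARCHIVE")); // 3
        System.out.println(expectedNodesSwitch("SSD"));     // 0
    }
}
```

Either form behaves identically; the choice is style, given that the project targets Java 8 or later.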
[jira] [Work logged] (HDFS-15548) Allow configuring DISK/ARCHIVE storage types on same device mount
[ https://issues.apache.org/jira/browse/HDFS-15548?focusedWorklogId=487441=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487441 ] ASF GitHub Bot logged work on HDFS-15548: - Author: ASF GitHub Bot Created on: 22/Sep/20 03:01 Start Date: 22/Sep/20 03:01 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2288: URL: https://github.com/apache/hadoop/pull/2288#issuecomment-695952357 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 1m 13s | Docker mode activated. | ||| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. | | +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. | ||| _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 36m 4s | trunk passed | | +1 :green_heart: | compile | 1m 54s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | compile | 1m 33s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | checkstyle | 1m 11s | trunk passed | | +1 :green_heart: | mvnsite | 1m 48s | trunk passed | | +1 :green_heart: | shadedclient | 20m 39s | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 1m 9s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 1m 39s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +0 :ok: | spotbugs | 4m 17s | Used deprecated FindBugs config; considering switching to SpotBugs. 
| | +1 :green_heart: | findbugs | 4m 15s | trunk passed | ||| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 38s | the patch passed | | +1 :green_heart: | compile | 1m 39s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | -1 :x: | javac | 1m 39s | hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 2 new + 602 unchanged - 0 fixed = 604 total (was 602) | | +1 :green_heart: | compile | 1m 26s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | -1 :x: | javac | 1m 26s | hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 2 new + 586 unchanged - 0 fixed = 588 total (was 586) | | +1 :green_heart: | checkstyle | 1m 6s | the patch passed | | +1 :green_heart: | mvnsite | 1m 38s | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. | | +1 :green_heart: | xml | 0m 2s | The patch has no ill-formed XML file. | | +1 :green_heart: | shadedclient | 17m 18s | patch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 1m 3s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 1m 38s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | findbugs | 4m 23s | the patch passed | ||| _ Other Tests _ | | -1 :x: | unit | 145m 33s | hadoop-hdfs in the patch passed. | | +1 :green_heart: | asflicense | 1m 13s | The patch does not generate ASF License warnings. 
| | | | 251m 57s | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.TestFileChecksumCompositeCrc | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS | | | hadoop.hdfs.TestFileChecksum | | | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA | | | hadoop.hdfs.server.datanode.TestBlockRecovery | | | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | | | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery | | | hadoop.hdfs.TestSnapshotCommands | | | hadoop.hdfs.server.namenode.TestDecommissioningStatus | | | hadoop.hdfs.server.namenode.ha.TestUpdateBlockTailing | | | hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks | | | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes | | | hadoop.hdfs.server.datanode.TestDataNodeUUID | | Subsystem | Report/Notes |
[jira] [Commented] (HDFS-15582) Reduce NameNode audit log
[ https://issues.apache.org/jira/browse/HDFS-15582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199739#comment-17199739 ] Jinglun commented on HDFS-15582: Re-upload v02 as v03. > Reduce NameNode audit log > - > > Key: HDFS-15582 > URL: https://issues.apache.org/jira/browse/HDFS-15582 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Minor > Attachments: HDFS-15582.001.patch, HDFS-15582.002.patch, > HDFS-15582.003.patch > > > Reduce the empty fields in audit log. Add a switch to skip all the empty > fields. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15582) Reduce NameNode audit log
[ https://issues.apache.org/jira/browse/HDFS-15582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinglun updated HDFS-15582: --- Attachment: HDFS-15582.003.patch > Reduce NameNode audit log > - > > Key: HDFS-15582 > URL: https://issues.apache.org/jira/browse/HDFS-15582 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Minor > Attachments: HDFS-15582.001.patch, HDFS-15582.002.patch, > HDFS-15582.003.patch > > > Reduce the empty fields in audit log. Add a switch to skip all the empty > fields. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15569) Speed up the Storage#doRecover during datanode rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-15569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199698#comment-17199698 ] Hadoop QA commented on HDFS-15569: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 47s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. 
{color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 29s{color} | {color:green} trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 14s{color} | {color:green} trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 56s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 29s{color} | {color:green} trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 3m 21s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 19s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 15s{color} | {color:green} the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s{color} | {color:green} the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 64 unchanged - 3 fixed = 64 total (was 67) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 6s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s{color} | {color:green} the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 34s{color} | {color:green} the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 28s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 33s{color} | {color:red}
[jira] [Commented] (HDFS-15583) Backport DirectoryScanner improvements HDFS-14476, HDFS-14751 and HDFS-15048 to branch 3.2 and 3.1
[ https://issues.apache.org/jira/browse/HDFS-15583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199694#comment-17199694 ] Wei-Chiu Chuang commented on HDFS-15583: +1 > Backport DirectoryScanner improvements HDFS-14476, HDFS-14751 and HDFS-15048 > to branch 3.2 and 3.1 > -- > > Key: HDFS-15583 > URL: https://issues.apache.org/jira/browse/HDFS-15583 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.2.0, 3.2.1 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Attachments: HDFS-15583.branch-3.2.001.patch > > > HDFS-14476, HDFS-14751 and HDFS-15048 made some good improvements to the > datanode DirectoryScanner, but due to a large refactor on that class in > branch-3.3, they are not trivial to backport to earlier branches. > HDFS-14476 introduced the problem in HDFS-14751 and a findbugs warning, fixed > in HDFS-15048, so these 3 need to be backported together. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15415) Reduce locking in Datanode DirectoryScanner
[ https://issues.apache.org/jira/browse/HDFS-15415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199693#comment-17199693 ] Wei-Chiu Chuang commented on HDFS-15415: +1. Sorry I reviewed but forgot to comment here. > Reduce locking in Datanode DirectoryScanner > --- > > Key: HDFS-15415 > URL: https://issues.apache.org/jira/browse/HDFS-15415 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.4.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Fix For: 3.4.0 > > Attachments: HDFS-15415.001.patch, HDFS-15415.002.patch, > HDFS-15415.003.patch, HDFS-15415.004.patch, HDFS-15415.005.patch, > HDFS-15415.branch-3.3.001.patch > > > In HDFS-15406, we have a small change to greatly reduce the runtime and > locking time of the datanode DirectoryScanner. There may be room for further > improvement. > From the scan step, we have captured a snapshot of what is on disk. After > calling `dataset.getFinalizedBlocks(bpid);` we have taken a snapshot of in > memory. The two snapshots are never 100% in sync as things are always > changing as the disk is scanned. > We are only comparing finalized blocks, so they should not really change: > * If a block is deleted after our snapshot, our snapshot will not see it and > that is OK. > * A finalized block could be appended. If that happens both the genstamp and > length will change, but that should be handled by reconcile when it calls > `FSDatasetImpl.checkAndUpdate()`, and there is nothing stopping blocks being > appended after they have been scanned from disk, but before they have been > compared with memory. > My suspicion is that we can do all the comparison work outside of the lock > and checkAndUpdate() re-checks any differences later under the lock on a > block by block basis. 
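The lock-narrowing idea in the description (snapshot under the lock, compare outside it, then re-check each difference under the lock, block by block) can be sketched as below. The types are deliberately simplified: a plain map of blockId to length and an intrinsic lock stand in for the FsDataset structures, and `reconcile` is an invented name, not Hadoop's actual API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Sketch of the pattern: hold the lock only to copy state and to
// re-verify candidates, never for the expensive full comparison.
class ScanSketch {
    final Object datasetLock = new Object();
    final Map<Long, Long> memoryBlocks = new HashMap<>(); // blockId -> length

    // diskSnapshot is what the directory scan found on disk.
    List<Long> reconcile(Map<Long, Long> diskSnapshot) {
        Map<Long, Long> memSnapshot;
        synchronized (datasetLock) {            // short hold: copy only
            memSnapshot = new HashMap<>(memoryBlocks);
        }

        // Expensive comparison runs with no lock held at all.
        List<Long> candidates = new ArrayList<>();
        for (Map.Entry<Long, Long> e : diskSnapshot.entrySet()) {
            if (!Objects.equals(memSnapshot.get(e.getKey()), e.getValue())) {
                candidates.add(e.getKey());
            }
        }

        // Re-check each candidate under the lock (the checkAndUpdate role),
        // since memory may have changed since the snapshot was taken.
        List<Long> confirmed = new ArrayList<>();
        for (Long id : candidates) {
            synchronized (datasetLock) {
                if (!Objects.equals(memoryBlocks.get(id), diskSnapshot.get(id))) {
                    confirmed.add(id);
                }
            }
        }
        return confirmed;
    }

    public static void main(String[] args) {
        ScanSketch s = new ScanSketch();
        s.memoryBlocks.put(1L, 100L);
        s.memoryBlocks.put(2L, 200L);
        Map<Long, Long> disk = new HashMap<>();
        disk.put(1L, 100L);
        disk.put(2L, 250L); // length mismatch, e.g. an appended block
        System.out.println(s.reconcile(disk)); // [2]
    }
}
```

The per-block re-check is what makes the unlocked comparison safe: a block appended between the two snapshots shows up as a candidate but is verified again under the lock before any action is taken.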
[jira] [Comment Edited] (HDFS-15415) Reduce locking in Datanode DirectoryScanner
[ https://issues.apache.org/jira/browse/HDFS-15415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199691#comment-17199691 ] Stephen O'Donnell edited comment on HDFS-15415 at 9/21/20, 10:27 PM: - I will commit the branch-3.3 patch tomorrow if there are no objections. It's basically identical to the trunk patch. The conflict is caused by the line: {code} Trunk: try (AutoCloseableLock lock = dataset.acquireDatasetReadLock()) { branch-3.3: try (AutoCloseableLock lock = dataset.acquireDatasetLock()) { {code} The change simply removes the lock from the entire block in the same way as the trunk patch. Waiting for HDFS-15583 to get committed before this can be backported to branch-3.2. was (Author: sodonnell): I will commit the branch-3.3 patch tomorrow if there are no objections. It's basically identical to the trunk patch. The conflict is caused by the line: {code} Trunk: try (AutoCloseableLock lock = dataset.acquireDatasetReadLock()) { branch-3.3: try (AutoCloseableLock lock = dataset.acquireDatasetLock()) { {code} The change simply removes the lock from the entire block. Waiting for HDFS-15583 to get committed before this can be backported to branch-3.2. > Reduce locking in Datanode DirectoryScanner > --- > > Key: HDFS-15415 > URL: https://issues.apache.org/jira/browse/HDFS-15415 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.4.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Fix For: 3.4.0 > > Attachments: HDFS-15415.001.patch, HDFS-15415.002.patch, > HDFS-15415.003.patch, HDFS-15415.004.patch, HDFS-15415.005.patch, > HDFS-15415.branch-3.3.001.patch > > > In HDFS-15406, we have a small change to greatly reduce the runtime and > locking time of the datanode DirectoryScanner. There may be room for further > improvement. > From the scan step, we have captured a snapshot of what is on disk. 
After > calling `dataset.getFinalizedBlocks(bpid);` we have taken a snapshot of in > memory. The two snapshots are never 100% in sync as things are always > changing as the disk is scanned. > We are only comparing finalized blocks, so they should not really change: > * If a block is deleted after our snapshot, our snapshot will not see it and > that is OK. > * A finalized block could be appended. If that happens both the genstamp and > length will change, but that should be handled by reconcile when it calls > `FSDatasetImpl.checkAndUpdate()`, and there is nothing stopping blocks being > appended after they have been scanned from disk, but before they have been > compared with memory. > My suspicion is that we can do all the comparison work outside of the lock > and checkAndUpdate() re-checks any differences later under the lock on a > block by block basis. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
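The scheme suggested above — diff the disk and memory snapshots without holding the dataset lock, then re-check only the suspected differences under the lock — can be sketched as a toy example. `Block` and `diffSnapshots` below are invented names for illustration only; the real code operates on `ReplicaInfo` objects and the per-block recheck is what `FSDatasetImpl.checkAndUpdate()` does under the lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ScannerSketch {
    // A finalized block is identified by id; genstamp/length change on append.
    record Block(long id, long genStamp, long length) {}

    // Lock-free pass: collect the ids of blocks that look different between
    // the on-disk snapshot and the in-memory snapshot. Only these suspects
    // need to be re-examined later under the dataset lock, block by block.
    static List<Long> diffSnapshots(Map<Long, Block> onDisk, Map<Long, Block> inMemory) {
        List<Long> suspects = new ArrayList<>();
        for (Map.Entry<Long, Block> e : onDisk.entrySet()) {
            Block mem = inMemory.get(e.getKey());
            // Missing in memory, or appended (genstamp/length changed).
            if (mem == null || !mem.equals(e.getValue())) {
                suspects.add(e.getKey());
            }
        }
        for (Long id : inMemory.keySet()) {
            // In memory but gone from disk (e.g. deleted after the scan).
            if (!onDisk.containsKey(id)) {
                suspects.add(id);
            }
        }
        return suspects;
    }
}
```

Because both snapshots are immutable copies, this comparison can run entirely outside the lock; stale results are tolerable since each suspect is re-verified under the lock before any action is taken.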
[jira] [Commented] (HDFS-15456) TestExternalStoragePolicySatisfier fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-15456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199685#comment-17199685 ] Leon Gao commented on HDFS-15456: - Hi [~ayushtkn] , Any other thoughts on this? > TestExternalStoragePolicySatisfier fails intermittently > --- > > Key: HDFS-15456 > URL: https://issues.apache.org/jira/browse/HDFS-15456 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ahmed Hussein >Assignee: Leon Gao >Priority: Major > Labels: pull-request-available, test > Time Spent: 1h > Remaining Estimate: 0h > > {{TestExternalStoragePolicySatisfier}} frequently times-out on hadoop trunk > {code:bash} > [ERROR] Tests run: 28, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 421.443 s <<< FAILURE! - in > org.apache.hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier > [ERROR] > testChooseInSameDatanodeWithONESSDShouldNotChooseIfNoSpace(org.apache.hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier) > Time elapsed: 43.983 s <<< ERROR! > java.util.concurrent.TimeoutException: > Timed out waiting for condition. 
Thread diagnostics: > Timestamp: 2020-07-07 07:51:10,267 > "IPC Server handler 4 on default port 44933" daemon prio=5 tid=1138 > timed_waiting > java.lang.Thread.State: TIMED_WAITING > at sun.misc.Unsafe.park(Native Method) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at > java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467) > at > org.apache.hadoop.ipc.CallQueueManager.take(CallQueueManager.java:307) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2918) > "ForkJoinPool-2-worker-19" daemon prio=5 tid=235 in Object.wait() > java.lang.Thread.State: WAITING (on object monitor) > at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1824) > at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1693) > at > java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157) > "refreshUsed-/home/jenkins/jenkins-slave/workspace/PreCommit-HADOOP-Build/sourcedir/hadoop-hdfs-project/hadoop-hdfs/target/test/data/4/dfs/data/data1/current/BP-912129709-172.17.0.2-1594151429636" > daemon prio=5 tid=1217 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at java.lang.Thread.sleep(Native Method) > at > org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:205) > at java.lang.Thread.run(Thread.java:748) > "Socket Reader #1 for port 0" daemon prio=5 tid=1192 runnable > java.lang.Thread.State: RUNNABLE > at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method) > at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269) > at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93) > at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) > at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) > at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101) > at > 
org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:1273) > at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:1252) > "pool-90-thread-1" prio=5 tid=1069 timed_waiting > java.lang.Thread.State: TIMED_WAITING > at sun.misc.Unsafe.park(Native Method) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) > at > java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > "IPC Server handler 2 on default port 37995" daemon prio=5 tid=1169 > timed_waiting > java.lang.Thread.State: TIMED_WAITING > at sun.misc.Unsafe.park(Native Method) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > at >
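The "Timed out waiting for condition" failure above comes from a polling wait helper in the test. As a minimal self-contained stand-in (this is the shape of such a helper, not Hadoop's actual `GenericTestUtils` implementation):

```java
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

public class WaitFor {
    // Poll `check` every checkEveryMillis until it passes or waitForMillis
    // elapses; on expiry, fail with a TimeoutException like the one in the
    // report above. Flaky tests often need the overall budget raised.
    static void waitFor(BooleanSupplier check, long checkEveryMillis, long waitForMillis)
            throws TimeoutException, InterruptedException {
        long deadline = System.currentTimeMillis() + waitForMillis;
        while (System.currentTimeMillis() < deadline) {
            if (check.getAsBoolean()) {
                return;
            }
            Thread.sleep(checkEveryMillis);
        }
        throw new TimeoutException("Timed out waiting for condition.");
    }
}
```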
[jira] [Work logged] (HDFS-15548) Allow configuring DISK/ARCHIVE storage types on same device mount
[ https://issues.apache.org/jira/browse/HDFS-15548?focusedWorklogId=487294=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487294 ] ASF GitHub Bot logged work on HDFS-15548: - Author: ASF GitHub Bot Created on: 21/Sep/20 21:42 Start Date: 21/Sep/20 21:42 Worklog Time Spent: 10m Work Description: LeonGao91 commented on pull request #2288: URL: https://github.com/apache/hadoop/pull/2288#issuecomment-696393867 @Hexiaoqiao Would you please take a second look? I have added a check as we discussed with UT. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487294) Time Spent: 3h 20m (was: 3h 10m) > Allow configuring DISK/ARCHIVE storage types on same device mount > - > > Key: HDFS-15548 > URL: https://issues.apache.org/jira/browse/HDFS-15548 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Leon Gao >Assignee: Leon Gao >Priority: Major > Labels: pull-request-available > Time Spent: 3h 20m > Remaining Estimate: 0h > > We can allow configuring DISK/ARCHIVE storage types on the same device mount > on two separate directories. > Users should be able to configure the capacity for each. Also, the datanode > usage report should report stats correctly. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
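The feature described above amounts to tagging two directories on the same physical mount with different storage types. The `[DISK]`/`[ARCHIVE]` prefix syntax for `dfs.datanode.data.dir` already exists in HDFS; how the per-type capacity split is configured is this Jira's subject, so no property name for it is shown here. A hedged illustration:

```xml
<!-- Illustrative only: two datanode directories on the same mount
     (/mnt/disk1), one serving DISK and one serving ARCHIVE storage. -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>[DISK]/mnt/disk1/dn-disk,[ARCHIVE]/mnt/disk1/dn-archive</value>
</property>
```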
[jira] [Commented] (HDFS-15543) RBF: Write Should allow, when a subcluster is unavailable for RANDOM mount points with fault Tolerance enabled.
[ https://issues.apache.org/jira/browse/HDFS-15543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199669#comment-17199669 ] Hadoop QA commented on HDFS-15543: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 43s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s{color} | {color:green} trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green} trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 35s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s{color} | {color:green} trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 1m 16s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 12s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green} the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green} the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 41s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s{color} | {color:green} the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 9m 19s{color} | {color:red} hadoop-hdfs-rbf in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
[jira] [Commented] (HDFS-15554) RBF: force router check file existence in destinations before adding/updating mount points
[ https://issues.apache.org/jira/browse/HDFS-15554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199561#comment-17199561 ] Fengnan Li commented on HDFS-15554: --- Thanks [~ayushtkn] [~elgoiri] ! > RBF: force router check file existence in destinations before adding/updating > mount points > -- > > Key: HDFS-15554 > URL: https://issues.apache.org/jira/browse/HDFS-15554 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Fengnan Li >Assignee: Fengnan Li >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 3h 40m > Remaining Estimate: 0h > > Adding/Updating mount points right now is only a router action without > validation in the downstream namenodes for the destination files/directories. > In practice we have set up the dangling mount points and when clients call > listStatus they would get the file returned, but then if they try to access > the file FileNotFoundException would be thrown out. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15554) RBF: force router check file existence in destinations before adding/updating mount points
[ https://issues.apache.org/jira/browse/HDFS-15554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena updated HDFS-15554: Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) Merged PR to trunk, Thanx [~fengnanli] for the contribution and [~elgoiri] for the review!!! > RBF: force router check file existence in destinations before adding/updating > mount points > -- > > Key: HDFS-15554 > URL: https://issues.apache.org/jira/browse/HDFS-15554 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Fengnan Li >Assignee: Fengnan Li >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 3h 40m > Remaining Estimate: 0h > > Adding/Updating mount points right now is only a router action without > validation in the downstream namenodes for the destination files/directories. > In practice we have set up the dangling mount points and when clients call > listStatus they would get the file returned, but then if they try to access > the file FileNotFoundException would be thrown out. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
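The validation this change introduces — probe the destination namespaces before accepting a mount table entry, so dangling mounts are rejected up front — can be sketched abstractly. `existsInNamespace` below is a placeholder for a `getFileInfo`-style RPC to each downstream namenode, not the router's real API, and whether one or all destinations must resolve is a policy choice; this sketch accepts the entry if any destination resolves.

```java
import java.util.List;
import java.util.function.Predicate;

public class MountValidator {
    // Reject an add/update if no destination path exists downstream;
    // otherwise clients would see the entry in listStatus but get
    // FileNotFoundException on access.
    static boolean canAddMountPoint(List<String> destinations,
                                    Predicate<String> existsInNamespace) {
        for (String dest : destinations) {
            if (existsInNamespace.test(dest)) {
                return true; // at least one real target
            }
        }
        return false; // all destinations dangling -> reject the mount entry
    }
}
```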
[jira] [Work logged] (HDFS-15554) RBF: force router check file existence in destinations before adding/updating mount points
[ https://issues.apache.org/jira/browse/HDFS-15554?focusedWorklogId=487133=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487133 ] ASF GitHub Bot logged work on HDFS-15554: - Author: ASF GitHub Bot Created on: 21/Sep/20 17:54 Start Date: 21/Sep/20 17:54 Worklog Time Spent: 10m Work Description: ayushtkn merged pull request #2266: URL: https://github.com/apache/hadoop/pull/2266 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487133) Time Spent: 3h 40m (was: 3.5h) > RBF: force router check file existence in destinations before adding/updating > mount points > -- > > Key: HDFS-15554 > URL: https://issues.apache.org/jira/browse/HDFS-15554 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Fengnan Li >Assignee: Fengnan Li >Priority: Minor > Labels: pull-request-available > Time Spent: 3h 40m > Remaining Estimate: 0h > > Adding/Updating mount points right now is only a router action without > validation in the downstream namenodes for the destination files/directories. > In practice we have set up the dangling mount points and when clients call > listStatus they would get the file returned, but then if they try to access > the file FileNotFoundException would be thrown out. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-15554) RBF: force router check file existence in destinations before adding/updating mount points
[ https://issues.apache.org/jira/browse/HDFS-15554?focusedWorklogId=487123=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487123 ] ASF GitHub Bot logged work on HDFS-15554: - Author: ASF GitHub Bot Created on: 21/Sep/20 17:45 Start Date: 21/Sep/20 17:45 Worklog Time Spent: 10m Work Description: fengnanli commented on pull request #2266: URL: https://github.com/apache/hadoop/pull/2266#issuecomment-696267644 @ayushtkn Can you help commit the change? Thanks a lot! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487123) Time Spent: 3.5h (was: 3h 20m) > RBF: force router check file existence in destinations before adding/updating > mount points > -- > > Key: HDFS-15554 > URL: https://issues.apache.org/jira/browse/HDFS-15554 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Fengnan Li >Assignee: Fengnan Li >Priority: Minor > Labels: pull-request-available > Time Spent: 3.5h > Remaining Estimate: 0h > > Adding/Updating mount points right now is only a router action without > validation in the downstream namenodes for the destination files/directories. > In practice we have set up the dangling mount points and when clients call > listStatus they would get the file returned, but then if they try to access > the file FileNotFoundException would be thrown out. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-15557) Log the reason why a storage log file can't be deleted
[ https://issues.apache.org/jira/browse/HDFS-15557?focusedWorklogId=487119=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487119 ] ASF GitHub Bot logged work on HDFS-15557: - Author: ASF GitHub Bot Created on: 21/Sep/20 17:40 Start Date: 21/Sep/20 17:40 Worklog Time Spent: 10m Work Description: goiri commented on pull request #2274: URL: https://github.com/apache/hadoop/pull/2274#issuecomment-696264939 Not sure why the build came out so badly... let's see if we can retrigger. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 487119) Time Spent: 1h 20m (was: 1h 10m) > Log the reason why a storage log file can't be deleted > -- > > Key: HDFS-15557 > URL: https://issues.apache.org/jira/browse/HDFS-15557 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ye Ni >Assignee: Ye Ni >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > Before > > {code:java} > 2020-09-02 06:48:31,983 WARN [IPC Server handler 206 on 8020] > org.apache.hadoop.hdfs.server.common.Storage: writeTransactionIdToStorage > failed on Storage Directory root= K:\data\hdfs\namenode; location= null; > type= IMAGE; isShared= false; lock= > sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid]; storageUuid= > null java.io.IOException: Could not delete original file > K:\data\hdfs\namenode\current\seen_txid{code} > > After > > {code:java} > 2020-09-02 17:43:29,421 WARN [IPC Server handler 111 on 8020] > org.apache.hadoop.hdfs.server.common.Storage: writeTransactionIdToStorage > failed on Storage Directory root= K:\data\hdfs\namenode; location= null; > type= IMAGE; isShared= false; lock= > sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid]; storageUuid= > null 
java.io.IOException: Could not delete original file > K:\data\hdfs\namenode\current\seen_txid due to failure: > java.nio.file.FileSystemException: K:\data\hdfs\namenode\current\seen_txid: > The process cannot access the file because it is being used by another > process.{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
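The before/after log lines show why `java.io.File.delete()` is a poor fit for diagnostics: it returns only a boolean, so the warning cannot say *why* the delete failed. Wrapping `java.nio.file.Files.delete()`, which throws an exception carrying the OS-level reason (e.g. the file is held open by another process on Windows), yields the improved message. A minimal sketch with a hypothetical helper name:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class DeleteWithReason {
    // Files.delete() throws NoSuchFileException, DirectoryNotEmptyException,
    // FileSystemException, etc.; we append that cause to the log message
    // instead of reporting a bare "Could not delete" from a false boolean.
    static void deleteOrExplain(File f) throws IOException {
        try {
            Files.delete(f.toPath());
        } catch (IOException e) {
            throw new IOException("Could not delete original file " + f
                + " due to failure: " + e, e);
        }
    }
}
```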
[jira] [Updated] (HDFS-15557) Log the reason why a storage log file can't be deleted
[ https://issues.apache.org/jira/browse/HDFS-15557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-15557: --- Status: Patch Available (was: Open) > Log the reason why a storage log file can't be deleted > -- > > Key: HDFS-15557 > URL: https://issues.apache.org/jira/browse/HDFS-15557 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ye Ni >Assignee: Ye Ni >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > Before > > {code:java} > 2020-09-02 06:48:31,983 WARN [IPC Server handler 206 on 8020] > org.apache.hadoop.hdfs.server.common.Storage: writeTransactionIdToStorage > failed on Storage Directory root= K:\data\hdfs\namenode; location= null; > type= IMAGE; isShared= false; lock= > sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid]; storageUuid= > null java.io.IOException: Could not delete original file > K:\data\hdfs\namenode\current\seen_txid{code} > > After > > {code:java} > 2020-09-02 17:43:29,421 WARN [IPC Server handler 111 on 8020] > org.apache.hadoop.hdfs.server.common.Storage: writeTransactionIdToStorage > failed on Storage Directory root= K:\data\hdfs\namenode; location= null; > type= IMAGE; isShared= false; lock= > sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid]; storageUuid= > null java.io.IOException: Could not delete original file > K:\data\hdfs\namenode\current\seen_txid due to failure: > java.nio.file.FileSystemException: K:\data\hdfs\namenode\current\seen_txid: > The process cannot access the file because it is being used by another > process.{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15569) Speed up the Storage#doRecover during datanode rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-15569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hemanth Boyina updated HDFS-15569: -- Attachment: HDFS-15569.003.patch > Speed up the Storage#doRecover during datanode rolling upgrade > --- > > Key: HDFS-15569 > URL: https://issues.apache.org/jira/browse/HDFS-15569 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Hemanth Boyina >Assignee: Hemanth Boyina >Priority: Major > Attachments: HDFS-15569.001.patch, HDFS-15569.002.patch, > HDFS-15569.003.patch > > > When upgrading datanode from hadoop 2.7.2 to 3.1.1 , because of jvm not > having enough memory upgrade failed , Adjusted memory configurations and re > upgraded datanode , > Now datanode upgrade has taken more time , on analyzing found that > Storage#deleteDir has taken more time in RECOVER_UPGRADE state > {code:java} > "Thread-28" #270 daemon prio=5 os_prio=0 tid=0x7fed5a9b8000 nid=0x2b5c > runnable [0x7fdcdad2a000]"Thread-28" #270 daemon prio=5 os_prio=0 > tid=0x7fed5a9b8000 nid=0x2b5c runnable [0x7fdcdad2a000] > java.lang.Thread.State: RUNNABLE at java.io.UnixFileSystem.delete0(Native > Method) at java.io.UnixFileSystem.delete(UnixFileSystem.java:265) at > java.io.File.delete(File.java:1041) at > org.apache.hadoop.fs.FileUtil.deleteImpl(FileUtil.java:229) at > org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:270) at > org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:182) at > org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:285) at > org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:182) at > org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:285) at > org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:182) at > org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:285) at > org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:182) at > org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:153) at > 
org.apache.hadoop.hdfs.server.common.Storage.deleteDir(Storage.java:1348) at > org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.doRecover(Storage.java:782) > at > org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadStorageDirectory(BlockPoolSliceStorage.java:174) > at > org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:224) > at > org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:253) > at > org.apache.hadoop.hdfs.server.datanode.DataStorage.loadBlockPoolSliceStorage(DataStorage.java:455) > at > org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:389) > - locked <0x7fdf08ec7548> (a > org.apache.hadoop.hdfs.server.datanode.DataStorage) at > org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:557) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1761) > - locked <0x7fdf08ec7598> (a > org.apache.hadoop.hdfs.server.datanode.DataNode) at > org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1697) > at > org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:392) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:282) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:822) > at java.lang.Thread.run(Thread.java:748) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
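The stack trace shows recovery blocked inside a recursive `fullyDelete` of the old storage directory. One common way to take such a delete off the critical path — shown here purely as a sketch of the idea, not the committed patch — is to rename the directory first (an O(1) metadata operation, so recovery can proceed immediately) and run the slow recursive delete on a background thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class AsyncDirDelete {
    // Move `dir` aside with a unique sibling name, then delete its contents
    // asynchronously (deepest entries first). Returns the worker thread so
    // callers can join if they need the space reclaimed synchronously.
    static Thread renameThenDeleteAsync(Path dir) throws IOException {
        Path trash = dir.resolveSibling(dir.getFileName() + ".trash." + System.nanoTime());
        Files.move(dir, trash); // fast: the upgrade/recovery can continue now
        Thread worker = new Thread(() -> {
            try (Stream<Path> walk = Files.walk(trash)) {
                // reverseOrder() sorts children before parents, so every
                // directory is empty by the time it is deleted.
                walk.sorted(Comparator.reverseOrder())
                    .forEach(p -> p.toFile().delete());
            } catch (IOException ignored) {
                // best-effort cleanup; leftovers can be swept on next start
            }
        });
        worker.start();
        return worker;
    }
}
```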
[jira] [Commented] (HDFS-15569) Speed up the Storage#doRecover during datanode rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-15569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199503#comment-17199503 ] Hemanth Boyina commented on HDFS-15569: --- thanks for the review [~weichiu] {quote}and why log rootPath instead of the curTmp? {quote} for any upgrade failure , in Storage#doRecover ,we are having log info with root path , so just to be identical have kept the same kind of log reason {code:java} case RECOVER_ROLLBACK: // mv removed.tmp -> current LOG.info("Recovering storage directory {} from previous rollback", rootPath); {code} though your point makes sense to me , have updated the patch fixing your comments , please review > Speed up the Storage#doRecover during datanode rolling upgrade > --- > > Key: HDFS-15569 > URL: https://issues.apache.org/jira/browse/HDFS-15569 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Hemanth Boyina >Assignee: Hemanth Boyina >Priority: Major > Attachments: HDFS-15569.001.patch, HDFS-15569.002.patch, > HDFS-15569.003.patch > > > When upgrading datanode from hadoop 2.7.2 to 3.1.1 , because of jvm not > having enough memory upgrade failed , Adjusted memory configurations and re > upgraded datanode , > Now datanode upgrade has taken more time , on analyzing found that > Storage#deleteDir has taken more time in RECOVER_UPGRADE state > {code:java} > "Thread-28" #270 daemon prio=5 os_prio=0 tid=0x7fed5a9b8000 nid=0x2b5c > runnable [0x7fdcdad2a000]"Thread-28" #270 daemon prio=5 os_prio=0 > tid=0x7fed5a9b8000 nid=0x2b5c runnable [0x7fdcdad2a000] > java.lang.Thread.State: RUNNABLE at java.io.UnixFileSystem.delete0(Native > Method) at java.io.UnixFileSystem.delete(UnixFileSystem.java:265) at > java.io.File.delete(File.java:1041) at > org.apache.hadoop.fs.FileUtil.deleteImpl(FileUtil.java:229) at > org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:270) at > org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:182) at > 
org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:285) at > org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:182) at > org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:285) at > org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:182) at > org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:285) at > org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:182) at > org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:153) at > org.apache.hadoop.hdfs.server.common.Storage.deleteDir(Storage.java:1348) at > org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.doRecover(Storage.java:782) > at > org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadStorageDirectory(BlockPoolSliceStorage.java:174) > at > org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:224) > at > org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:253) > at > org.apache.hadoop.hdfs.server.datanode.DataStorage.loadBlockPoolSliceStorage(DataStorage.java:455) > at > org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:389) > - locked <0x7fdf08ec7548> (a > org.apache.hadoop.hdfs.server.datanode.DataStorage) at > org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:557) > at > org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1761) > - locked <0x7fdf08ec7598> (a > org.apache.hadoop.hdfs.server.datanode.DataNode) at > org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1697) > at > org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:392) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:282) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:822) > at 
java.lang.Thread.run(Thread.java:748) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
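The recursive file-by-file delete in the stack trace above is what makes doRecover slow. One common way to take such a delete off the critical path (a sketch only, under my own assumptions; this is not the actual HDFS-15569 patch, and the class and method names here are hypothetical) is to rename the doomed directory aside, which is a cheap O(1) operation, and reclaim the space on a background thread:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class AsyncDirDelete {
    // Rename the doomed directory to a unique "to-delete" name (cheap, O(1)),
    // then reclaim the space on a background thread so recovery is not blocked.
    static Thread deleteAsync(File dir) throws IOException {
        File doomed = new File(dir.getParent(), dir.getName() + ".toDelete." + System.nanoTime());
        if (!dir.renameTo(doomed)) {
            throw new IOException("rename failed for " + dir);
        }
        Thread t = new Thread(() -> fullyDelete(doomed), "async-delete");
        t.start();
        return t;
    }

    // Simplified stand-in for FileUtil.fullyDelete: recursive delete.
    static void fullyDelete(File f) {
        File[] children = f.listFiles();
        if (children != null) {
            for (File c : children) fullyDelete(c);
        }
        f.delete();
    }

    public static void main(String[] args) throws Exception {
        Path root = Files.createTempDirectory("removed.tmp");
        Files.createFile(root.resolve("blk_1"));
        Thread t = deleteAsync(root.toFile());
        // The original path disappears as soon as the rename returns; the slow
        // recursive delete happens on the background thread.
        t.join();
        System.out.println(Files.exists(root));
    }
}
```

Note that File#renameTo is only cheap when source and target are on the same filesystem, which holds here because the trash name is a sibling of the original directory.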
[jira] [Commented] (HDFS-15543) RBF: Write Should allow, when a subcluster is unavailable for RANDOM mount points with fault Tolerance enabled.
[ https://issues.apache.org/jira/browse/HDFS-15543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199501#comment-17199501 ] Hemanth Boyina commented on HDFS-15543: --- Thanks for the review [~elgoiri]. {quote}The failed tests look suspicious {quote} RouterClientProtocol#isUnavailableSubclusterException makes a recursive call on the IOException's cause; changing this method to return void caused the exception's cause to be thrown rather than the actual exception. I have updated the patch, please review. > RBF: Write Should allow, when a subcluster is unavailable for RANDOM mount > points with fault Tolerance enabled. > > > Key: HDFS-15543 > URL: https://issues.apache.org/jira/browse/HDFS-15543 > Project: Hadoop HDFS > Issue Type: Bug > Components: rbf >Affects Versions: 3.1.1 >Reporter: Harshakiran Reddy >Assignee: Hemanth Boyina >Priority: Major > Attachments: HDFS-15543.001.patch, HDFS-15543.002.patch, > HDFS-15543.003.patch, HDFS-15543.004.patch, HDFS-15543.005.patch, > HDFS-15543_testrepro.patch > > > A RANDOM mount point should allow creating new files when one subcluster is > down and fault tolerance is enabled, but here it fails. > MultiDestination_client]# hdfs dfsrouteradmin -ls /test_ec > *Mount Table Entries:* > Source Destinations Owner Group Mode Quota/Usage > /test_ec *hacluster->/tes_ec,hacluster1->/tes_ec* test ficommon rwxr-xr-x > [NsQuota: -/-, SsQuota: -/-] > *File write threw the exception:* > 2020-08-26 19:13:21,839 WARN hdfs.DataStreamer: Abandoning blk_1073743375_2551 > 2020-08-26 19:13:21,877 WARN hdfs.DataStreamer: Excluding datanode > DatanodeInfoWithStorage[DISK] > 2020-08-26 19:13:21,878 WARN hdfs.DataStreamer: DataStreamer Exception > java.io.IOException: Unable to create new block. > at > org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1758) > at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:718) > 2020-08-26 19:13:21,879 WARN hdfs.DataStreamer: Could not get block > locations. 
Source file "/test_ec/f1._COPYING_" - Aborting...block==null > put: Could not get block locations. Source file "/test_ec/f1._COPYING_" - > Aborting...block==null
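The cause-chain handling discussed in the review comment above can be sketched as follows (a hypothetical, simplified stand-in for RouterClientProtocol#isUnavailableSubclusterException; here ConnectException stands in for whatever exception types actually signal an unreachable subcluster):

```java
import java.io.IOException;
import java.net.ConnectException;

public class CauseChainCheck {
    // Walk the cause chain; return true if any link signals an unreachable
    // subcluster. Returning a boolean (instead of throwing from inside the
    // recursion) lets the caller rethrow the ORIGINAL exception, not its cause.
    static boolean isUnavailableSubcluster(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof ConnectException) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        IOException wrapped = new IOException("rpc failed", new ConnectException("ns1 down"));
        System.out.println(isUnavailableSubcluster(wrapped));
        System.out.println(isUnavailableSubcluster(new IOException("unrelated")));
        // On a miss, the caller throws 'wrapped' itself (not wrapped.getCause()),
        // so the client sees the actual exception - the pitfall noted above.
    }
}
```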
[jira] [Updated] (HDFS-15543) RBF: Write Should allow, when a subcluster is unavailable for RANDOM mount points with fault Tolerance enabled.
[ https://issues.apache.org/jira/browse/HDFS-15543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hemanth Boyina updated HDFS-15543: -- Attachment: HDFS-15543.005.patch > RBF: Write Should allow, when a subcluster is unavailable for RANDOM mount > points with fault Tolerance enabled. > > > Key: HDFS-15543 > URL: https://issues.apache.org/jira/browse/HDFS-15543 > Project: Hadoop HDFS > Issue Type: Bug > Components: rbf >Affects Versions: 3.1.1 >Reporter: Harshakiran Reddy >Assignee: Hemanth Boyina >Priority: Major > Attachments: HDFS-15543.001.patch, HDFS-15543.002.patch, > HDFS-15543.003.patch, HDFS-15543.004.patch, HDFS-15543.005.patch, > HDFS-15543_testrepro.patch > > > A RANDOM mount point should allow creating new files when one subcluster is > down and fault tolerance is enabled, but here it fails. > MultiDestination_client]# hdfs dfsrouteradmin -ls /test_ec > *Mount Table Entries:* > Source Destinations Owner Group Mode Quota/Usage > /test_ec *hacluster->/tes_ec,hacluster1->/tes_ec* test ficommon rwxr-xr-x > [NsQuota: -/-, SsQuota: -/-] > *File write threw the exception:* > 2020-08-26 19:13:21,839 WARN hdfs.DataStreamer: Abandoning blk_1073743375_2551 > 2020-08-26 19:13:21,877 WARN hdfs.DataStreamer: Excluding datanode > DatanodeInfoWithStorage[DISK] > 2020-08-26 19:13:21,878 WARN hdfs.DataStreamer: DataStreamer Exception > java.io.IOException: Unable to create new block. > at > org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1758) > at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:718) > 2020-08-26 19:13:21,879 WARN hdfs.DataStreamer: Could not get block > locations. 
Source file "/test_ec/f1._COPYING_" - > Aborting...block==null
[jira] [Commented] (HDFS-15589) Huge PostponedMisreplicatedBlocks can't decrease immediately when start namenode after datanode
[ https://issues.apache.org/jira/browse/HDFS-15589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199432#comment-17199432 ] Ayush Saxena commented on HDFS-15589: - If you mean that the blocks don't get deleted after a namenode restart/failover, then yes, that won't happen until the BRs are received after the DNs are marked stale. That is by design: the datanodes are marked stale after the NN takes the active state so as to prevent deletion of blocks, and they are unmarked once the BR is received. > Huge PostponedMisreplicatedBlocks can't decrease immediately when start > namenode after datanode > --- > > Key: HDFS-15589 > URL: https://issues.apache.org/jira/browse/HDFS-15589 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs > Environment: CentOS 7 >Reporter: zhengchenyu >Priority: Major > > In our test cluster, I restarted my namenode and found many > PostponedMisreplicatedBlocks that did not decrease immediately. > I searched the logs and found entries like the following: > {code:java} > 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: > from DatanodeRegistration(xx.xx.xx.xx:9866, > datanodeUuid=c6a9934f-afd4-4437-b976-fed55173ce57, infoPort=9864, > infoSecurePort=0, ipcPort=9867, > storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), > reports.length=12 > 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: > from DatanodeRegistration(xx.xx.xx.xx:9866, > datanodeUuid=aee144f1-2082-4bca-a92b-f3c154a71c65, infoPort=9864, > infoSecurePort=0, ipcPort=9867, > storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), > reports.length=12 > 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: > from DatanodeRegistration(xx.xx.xx.xx:9866, > datanodeUuid=d152fa5b-1089-4bfc-b9c4-e3a7d98c7a7b, infoPort=9864, > infoSecurePort=0, ipcPort=9867, > 
storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), > reports.length=12 > 2020-09-21 17:02:37,156 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: > from DatanodeRegistration(xx.xx.xx.xx:9866, > datanodeUuid=5cffc1fe-ace9-4af8-adfc-6002a7f5565d, infoPort=9864, > infoSecurePort=0, ipcPort=9867, > storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), > reports.length=12 > 2020-09-21 17:02:37,161 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: > from DatanodeRegistration(xx.xx.xx.xx:9866, > datanodeUuid=9980d8e1-b0d9-4657-b97d-c803f82c1459, infoPort=9864, > infoSecurePort=0, ipcPort=9867, > storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), > reports.length=12 > 2020-09-21 17:02:37,197 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: > from DatanodeRegistration(xx.xx.xx.xx:9866, > datanodeUuid=77ff3f5e-37f0-405f-a16c-166311546cae, infoPort=9864, > infoSecurePort=0, ipcPort=9867, > storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), > reports.length=12 > {code} > Note: the test cluster has only 6 datanodes. > You can see that the block reports arrived before "Marking all datanodes as > stale", which is logged by startActiveServices. DatanodeStorageInfo.blockContentsStale > is only set to false on a block report, and startActiveServices then marks all > datanodes stale again. So the datanodes stay stale until the next block report, > and PostponedMisreplicatedBlocks stays huge.
[jira] [Updated] (HDFS-15589) Huge PostponedMisreplicatedBlocks can't decrease immediately when start namenode after datanode
[ https://issues.apache.org/jira/browse/HDFS-15589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated HDFS-15589: --- Description: In our test cluster, I restart my namenode. Then I found many PostponedMisreplicatedBlocks which doesn't decrease immediately. I search the log below like this. {code:java} 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=c6a9934f-afd4-4437-b976-fed55173ce57, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=aee144f1-2082-4bca-a92b-f3c154a71c65, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=d152fa5b-1089-4bfc-b9c4-e3a7d98c7a7b, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,156 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=5cffc1fe-ace9-4af8-adfc-6002a7f5565d, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,161 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=9980d8e1-b0d9-4657-b97d-c803f82c1459, infoPort=9864, infoSecurePort=0, ipcPort=9867, 
storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,197 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=77ff3f5e-37f0-405f-a16c-166311546cae, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 {code} Node: test cluster only have 6 datanode. You will see the blockreport called before "Marking all datanodes as stale" which is logged by startActiveServices. But DatanodeStorageInfo.blockContentsStale only set to false in blockreport, then startActiveServices set all datnaode to stale node. So the datanodes will keep stale util next blockreport, then PostponedMisreplicatedBlocks keep a huge number. was: In our test cluster, I restart my namenode. Then I found many PostponedMisreplicatedBlocks which doesn't decrease immediately. I search the log below like this. 
{code:java} 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=c6a9934f-afd4-4437-b976-fed55173ce57, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=aee144f1-2082-4bca-a92b-f3c154a71c65, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=d152fa5b-1089-4bfc-b9c4-e3a7d98c7a7b, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,156 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=5cffc1fe-ace9-4af8-adfc-6002a7f5565d, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,161 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=9980d8e1-b0d9-4657-b97d-c803f82c1459, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,197 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=77ff3f5e-37f0-405f-a16c-166311546cae, infoPort=9864, infoSecurePort=0, ipcPort=9867,
[jira] [Updated] (HDFS-15589) Huge PostponedMisreplicatedBlocks can't decrease immediately when start namenode after datanode
[ https://issues.apache.org/jira/browse/HDFS-15589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated HDFS-15589: --- Description: In our test cluster, I restart my namenode. Then I found many PostponedMisreplicatedBlocks which doesn't decrease immediately. I search the log below like this. {code:java} 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=c6a9934f-afd4-4437-b976-fed55173ce57, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=aee144f1-2082-4bca-a92b-f3c154a71c65, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=d152fa5b-1089-4bfc-b9c4-e3a7d98c7a7b, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,156 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=5cffc1fe-ace9-4af8-adfc-6002a7f5565d, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,161 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=9980d8e1-b0d9-4657-b97d-c803f82c1459, infoPort=9864, infoSecurePort=0, ipcPort=9867, 
storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,197 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=77ff3f5e-37f0-405f-a16c-166311546cae, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 {code} Node: test cluster only have 6 datanode. You will see the blockreport called before "Marking all datanodes as stale" which is logged by startActiveServices. But DatanodeStorageInfo.blockContentsStale only set to false in blockreport, then startActiveServices set all datnaode to stale node. So the datanodes will keep stale util next blockreport. was: In our test cluster, I restart my namenode. Then I found many PostponedMisreplicatedBlocks which doesn't decrease immediately. I search the log below like this. {code} 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=c6a9934f-afd4-4437-b976-fed55173ce57, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=aee144f1-2082-4bca-a92b-f3c154a71c65, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=d152fa5b-1089-4bfc-b9c4-e3a7d98c7a7b, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 
17:02:37,156 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=5cffc1fe-ace9-4af8-adfc-6002a7f5565d, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,161 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=9980d8e1-b0d9-4657-b97d-c803f82c1459, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,197 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=77ff3f5e-37f0-405f-a16c-166311546cae, infoPort=9864, infoSecurePort=0, ipcPort=9867,
[jira] [Created] (HDFS-15589) Huge PostponedMisreplicatedBlocks can't decrease immediately when start namenode after datanode
zhengchenyu created HDFS-15589: -- Summary: Huge PostponedMisreplicatedBlocks can't decrease immediately when start namenode after datanode Key: HDFS-15589 URL: https://issues.apache.org/jira/browse/HDFS-15589 Project: Hadoop HDFS Issue Type: Bug Components: hdfs Environment: CentOS 7 Reporter: zhengchenyu In our test cluster, I restart my namenode. Then I found many PostponedMisreplicatedBlocks which doesn't decrease immediately. I search the log below like this. {code} 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=c6a9934f-afd4-4437-b976-fed55173ce57, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=aee144f1-2082-4bca-a92b-f3c154a71c65, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,029 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=d152fa5b-1089-4bfc-b9c4-e3a7d98c7a7b, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,156 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=5cffc1fe-ace9-4af8-adfc-6002a7f5565d, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,161 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=9980d8e1-b0d9-4657-b97d-c803f82c1459, 
infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 2020-09-21 17:02:37,197 DEBUG BlockStateChange: *BLOCK* NameNode.blockReport: from DatanodeRegistration(xx.xx.xx.xx:9866, datanodeUuid=77ff3f5e-37f0-405f-a16c-166311546cae, infoPort=9864, infoSecurePort=0, ipcPort=9867, storageInfo=lv=-57;cid=CID-9f6d0a32-e51c-459a-9f65-6e7b5791ee25;nsid=1016509846;c=1592578350834), reports.length=12 {code} Note: the test cluster has only 6 datanodes. You can see that the block reports arrived before "Marking all datanodes as stale", which is logged by startActiveServices. DatanodeStorageInfo.blockContentsStale is only set to false on a block report, and startActiveServices then marks all datanodes stale again. So the datanodes stay stale until the next block report.
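The sequencing problem described in this issue can be modelled with a toy flag (an illustration only; the real logic lives in DatanodeStorageInfo and FSNamesystem#startActiveServices, and this class is hypothetical):

```java
public class StaleModel {
    // Toy model of per-storage staleness as described in the issue.
    static class Storage {
        boolean blockContentsStale = true;
        void receiveBlockReport() { blockContentsStale = false; }  // BR clears the flag
        void markStaleOnActive()  { blockContentsStale = true; }   // failover re-sets it
    }

    public static void main(String[] args) {
        Storage s = new Storage();
        s.receiveBlockReport();   // BR arrives while the NN is still starting up
        s.markStaleOnActive();    // startActiveServices then marks everything stale
        // The early BR is effectively lost: the storage stays stale until the
        // next full block report, so postponed misreplicated blocks linger.
        System.out.println(s.blockContentsStale);
    }
}
```

If the order were reversed (mark stale first, then process block reports), the early report would clear the flag and the postponed blocks could be rescanned immediately.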
[jira] [Commented] (HDFS-15582) Reduce NameNode audit log
[ https://issues.apache.org/jira/browse/HDFS-15582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199356#comment-17199356 ] Hadoop QA commented on HDFS-15582: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 3m 3s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 24s{color} | {color:red} root in trunk failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 23s{color} | {color:red} hadoop-hdfs in trunk failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 25s{color} | {color:red} hadoop-hdfs in trunk failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 23s{color} | {color:orange} The patch fails to run checkstyle in hadoop-hdfs {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 27s{color} | {color:red} hadoop-hdfs in trunk failed. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 1m 17s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 25s{color} | {color:red} hadoop-hdfs in trunk failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 23s{color} | {color:red} hadoop-hdfs in trunk failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 2m 35s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 28s{color} | {color:red} hadoop-hdfs in trunk failed. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 24s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 24s{color} | {color:red} hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 24s{color} | {color:red} hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 25s{color} | {color:red} hadoop-hdfs in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 25s{color} | {color:red} hadoop-hdfs in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 23s{color} | {color:orange} The patch fails to run checkstyle in hadoop-hdfs {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 23s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 0m 22s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 24s{color} | {color:red} hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 24s{color} | {color:red} hadoop-hdfs in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 56s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || |
[jira] [Commented] (HDFS-15442) Image upload may fail if dfs.image.transfer.chunksize wrongly set to negative value
[ https://issues.apache.org/jira/browse/HDFS-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199350#comment-17199350 ] AMC-team commented on HDFS-15442: - [~ayushtkn] Thanks for the reminder; I have uploaded the patch again. > Image upload may fail if dfs.image.transfer.chunksize wrongly set to negative > value > --- > > Key: HDFS-15442 > URL: https://issues.apache.org/jira/browse/HDFS-15442 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: AMC-team >Assignee: AMC-team >Priority: Major > Attachments: HDFS-15442.000.patch > > > In the current implementation of checkpoint image transfer, if the file length > is bigger than the configured value of dfs.image.transfer.chunksize, chunked > streaming mode is used to avoid internal buffering. This mode should be used > only if more than chunkSize data is present to upload; otherwise the upload may > not happen. > {code:java} > //TransferFsImage.java > int chunkSize = (int) conf.getLongBytes( > DFSConfigKeys.DFS_IMAGE_TRANSFER_CHUNKSIZE_KEY, > DFSConfigKeys.DFS_IMAGE_TRANSFER_CHUNKSIZE_DEFAULT); > if (imageFile.length() > chunkSize) { > // using chunked streaming mode to support upload of 2GB+ files and to > // avoid internal buffering. > // this mode should be used only if more than chunkSize data is present > // to upload. otherwise upload may not happen sometimes. > connection.setChunkedStreamingMode(chunkSize); > } > {code} > There is no validation code for this parameter, so a user may accidentally set > it to an invalid value. Here, if the user sets chunkSize to a negative value, > chunked streaming mode will always be used. In > setChunkedStreamingMode(chunkSize), there is correction code: if the > chunk length is <= 0, it is changed to DEFAULT_CHUNK_SIZE. 
> {code:java} > public void setChunkedStreamingMode (int chunklen) { > if (connected) { > throw new IllegalStateException ("Can't set streaming mode: already > connected"); > } > if (fixedContentLength != -1 || fixedContentLengthLong != -1) { > throw new IllegalStateException ("Fixed length streaming mode set"); > } > chunkLength = chunklen <=0? DEFAULT_CHUNK_SIZE : chunklen; > } > {code} > However, > *if the user sets dfs.image.transfer.chunksize to a value <= 0, chunked > streaming mode will be used even for images whose imageFile.length() < > DEFAULT_CHUNK_SIZE, and the upload may fail as mentioned above.* *(This > scenario may not be common, but* *we can prevent users from setting this > parameter to an extremely small value.**)* > *How to fix:* > Add checking or correction code right after parsing the config value, before > the value is actually used (setChunkedStreamingMode). > 
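The suggested fix can be sketched like this (a hypothetical helper, not the HDFS-15442 patch; the DEFAULT_CHUNK_SIZE value used here is an assumption for illustration):

```java
public class ChunkSizeCheck {
    // Assumed fallback; stands in for HttpURLConnection's internal default.
    static final int DEFAULT_CHUNK_SIZE = 4096;

    // Validate the configured chunk size BEFORE it influences the
    // chunked-streaming decision, instead of relying on the correction
    // buried inside setChunkedStreamingMode.
    static int sanitizeChunkSize(long configured) {
        if (configured <= 0 || configured > Integer.MAX_VALUE) {
            return DEFAULT_CHUNK_SIZE; // fall back instead of passing a bad value through
        }
        return (int) configured;
    }

    // Mirrors the TransferFsImage condition, but on the sanitized value:
    // chunked streaming only when the image really exceeds the chunk size.
    static boolean useChunkedStreaming(long imageLength, long configuredChunkSize) {
        return imageLength > sanitizeChunkSize(configuredChunkSize);
    }

    public static void main(String[] args) {
        // Small image + bad (negative) config: no chunked streaming after the fix.
        System.out.println(useChunkedStreaming(1024, -1));
        // Genuinely large image: chunked streaming is still chosen.
        System.out.println(useChunkedStreaming(1L << 32, -1));
    }
}
```

With this check in place, a misconfigured negative chunksize can no longer force chunked streaming for small images, which is exactly the failure mode the description warns about.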
[jira] [Commented] (HDFS-15442) Image upload may fail if dfs.image.transfer.chunksize wrongly set to negative value
[ https://issues.apache.org/jira/browse/HDFS-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199341#comment-17199341 ] Hadoop QA commented on HDFS-15442: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 27m 59s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. 
{color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 17s{color} | {color:green} trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 12s{color} | {color:green} trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 3s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green} trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 26s{color} | {color:green} trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 3m 1s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 59s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 9s{color} | {color:green} the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s{color} | {color:green} the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 40s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 13 unchanged - 0 fixed = 15 total (was 13) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 58s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 19s{color} | {color:green} the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 2s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}101m 35s{color} |
[jira] [Commented] (HDFS-15582) Reduce NameNode audit log
[ https://issues.apache.org/jira/browse/HDFS-15582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199309#comment-17199309 ] Jinglun commented on HDFS-15582: Uploaded v02: fixed checkstyle and whitespace issues. The failed tests pass on my machine and seem unrelated to this patch. > Reduce NameNode audit log > - > > Key: HDFS-15582 > URL: https://issues.apache.org/jira/browse/HDFS-15582 > Project: Hadoop HDFS > Issue Type: Improvement > Reporter: Jinglun > Assignee: Jinglun > Priority: Minor > Attachments: HDFS-15582.001.patch, HDFS-15582.002.patch > > > Reduce the empty fields in the audit log: add a switch to skip all the empty fields. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
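The change described above (skip empty audit-log fields behind a switch) can be sketched roughly as follows; all class, method, and field names here are illustrative assumptions, not the actual NameNode audit-logger API:

```java
// Illustrative sketch only: builds an audit log line from key=value pairs,
// optionally omitting fields whose value is null or empty. Names are
// hypothetical and do not mirror the real FSNamesystem audit logger.
import java.util.LinkedHashMap;
import java.util.Map;

public class AuditLogSketch {
    static String buildAuditLine(Map<String, String> fields, boolean skipEmptyFields) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            String value = e.getValue();
            if (skipEmptyFields && (value == null || value.isEmpty())) {
                continue; // the switch: drop empty fields to shrink the line
            }
            if (sb.length() > 0) {
                sb.append('\t');
            }
            sb.append(e.getKey()).append('=').append(value == null ? "" : value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("cmd", "listStatus");
        fields.put("src", "/user/test");
        fields.put("dst", "");    // empty field
        fields.put("perm", null); // missing field
        System.out.println(buildAuditLine(fields, true));  // empty fields dropped
        System.out.println(buildAuditLine(fields, false)); // all fields kept
    }
}
```

With the switch on, only non-empty fields are emitted, which is where the log-size reduction comes from.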
[jira] [Updated] (HDFS-15582) Reduce NameNode audit log
[ https://issues.apache.org/jira/browse/HDFS-15582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinglun updated HDFS-15582: --- Attachment: HDFS-15582.002.patch
[jira] [Commented] (HDFS-15442) Image upload may fail if dfs.image.transfer.chunksize wrongly set to negative value
[ https://issues.apache.org/jira/browse/HDFS-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199299#comment-17199299 ] Hadoop QA commented on HDFS-15442: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 42s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. 
{color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 16s{color} | {color:green} trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 8s{color} | {color:green} trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 32s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s{color} | {color:green} trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 25s{color} | {color:green} trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 3m 5s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 2s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 7s{color} | {color:green} the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s{color} | {color:green} the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 38s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 13 unchanged - 0 fixed = 15 total (was 13) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 13s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s{color} | {color:green} the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 36s{color} | {color:green} the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 36s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 16m 23s{color} |
[jira] [Updated] (HDFS-15442) Image upload may fail if dfs.image.transfer.chunksize wrongly set to negative value
[ https://issues.apache.org/jira/browse/HDFS-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] AMC-team updated HDFS-15442: Status: Open (was: Patch Available) > Image upload may fail if dfs.image.transfer.chunksize wrongly set to negative > value > --- > > Key: HDFS-15442 > URL: https://issues.apache.org/jira/browse/HDFS-15442 > Project: Hadoop HDFS > Issue Type: Bug > Reporter: AMC-team > Priority: Major > Attachments: HDFS-15442.000.patch > > > In the current implementation of checkpoint image transfer, if the file length is > larger than the configured value of dfs.image.transfer.chunksize, chunked streaming > mode is used to avoid internal buffering. This mode should be used only if more than > chunkSize data is present to upload; otherwise the upload may sometimes not happen. > {code:java} > //TransferFsImage.java > int chunkSize = (int) conf.getLongBytes( > DFSConfigKeys.DFS_IMAGE_TRANSFER_CHUNKSIZE_KEY, > DFSConfigKeys.DFS_IMAGE_TRANSFER_CHUNKSIZE_DEFAULT); > if (imageFile.length() > chunkSize) { > // using chunked streaming mode to support upload of 2GB+ files and to > // avoid internal buffering. > // this mode should be used only if more than chunkSize data is present > // to upload. otherwise upload may not happen sometimes. > connection.setChunkedStreamingMode(chunkSize); > } > {code} > There is no validation of this parameter, so a user may accidentally set it to an > invalid value. If the user sets chunkSize to a negative value, chunked streaming mode > will always be used. In setChunkedStreamingMode(chunkSize), there is correction code: > if the chunkSize is <= 0, it is changed to DEFAULT_CHUNK_SIZE.
> {code:java} > public void setChunkedStreamingMode (int chunklen) { > if (connected) { > throw new IllegalStateException ("Can't set streaming mode: already > connected"); > } > if (fixedContentLength != -1 || fixedContentLengthLong != -1) { > throw new IllegalStateException ("Fixed length streaming mode set"); > } > chunkLength = chunklen <= 0 ? DEFAULT_CHUNK_SIZE : chunklen; > } > {code} > However, > *if the user sets dfs.image.transfer.chunksize to a value <= 0, even images whose > imageFile.length() < DEFAULT_CHUNK_SIZE will use chunked streaming mode, which may > fail the upload as mentioned above.* *(This scenario may not be common, but we can > prevent users from setting this parameter to an extremely small value.)* > *How to fix:* > Add checking or correction code right after parsing the config value, before the > value is actually used in setChunkedStreamingMode(). >
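The proposed fix can be sketched as a small validation helper applied right after the config value is parsed, before deciding on chunked streaming. This is a sketch, not the actual patch: the helper names are invented, and the 64 KB default mirrors what hdfs-default.xml ships for dfs.image.transfer.chunksize but should be treated as an assumption here:

```java
// Sketch of the suggested fix: sanitize dfs.image.transfer.chunksize after
// parsing so a zero or negative value falls back to the default instead of
// making imageFile.length() > chunkSize trivially true for every image.
public class ChunkSizeCheck {
    // Assumed to match DFS_IMAGE_TRANSFER_CHUNKSIZE_DEFAULT (64 KB).
    static final int DEFAULT_CHUNK_SIZE = 64 * 1024;

    static int sanitizeChunkSize(long configured) {
        if (configured <= 0) {
            // Invalid user setting: fall back rather than propagate it.
            return DEFAULT_CHUNK_SIZE;
        }
        // Guard the narrowing cast used by setChunkedStreamingMode(int).
        return (int) Math.min(configured, Integer.MAX_VALUE);
    }

    static boolean useChunkedStreaming(long imageLength, long configuredChunkSize) {
        int chunkSize = sanitizeChunkSize(configuredChunkSize);
        // Chunked streaming only when the image exceeds one chunk.
        return imageLength > chunkSize;
    }

    public static void main(String[] args) {
        // Small image plus a bad (negative) config: no chunked streaming.
        System.out.println(useChunkedStreaming(1024, -1));
        // Large (3 GB) image: chunked streaming regardless of the bad config.
        System.out.println(useChunkedStreaming(3L << 30, -1));
    }
}
```

With this guard in place, a misconfigured negative chunk size no longer forces small images into chunked streaming mode.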
[jira] [Updated] (HDFS-15442) Image upload may fail if dfs.image.transfer.chunksize wrongly set to negative value
[ https://issues.apache.org/jira/browse/HDFS-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] AMC-team updated HDFS-15442: Attachment: HDFS-15442.000.patch Assignee: AMC-team Status: Patch Available (was: Open)
[jira] [Updated] (HDFS-15442) Image upload may fail if dfs.image.transfer.chunksize wrongly set to negative value
[ https://issues.apache.org/jira/browse/HDFS-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] AMC-team updated HDFS-15442: Attachment: (was: HDFS-15442.000.patch)
[jira] [Updated] (HDFS-15442) Image upload may fail if dfs.image.transfer.chunksize wrongly set to negative value
[ https://issues.apache.org/jira/browse/HDFS-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] AMC-team updated HDFS-15442: Attachment: HDFS-15442.000.patch
[jira] [Updated] (HDFS-15442) Image upload may fail if dfs.image.transfer.chunksize wrongly set to negative value
[ https://issues.apache.org/jira/browse/HDFS-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] AMC-team updated HDFS-15442: Attachment: (was: HDFS-15442.000.patch)
[jira] [Work logged] (HDFS-15025) Applying NVDIMM storage media to HDFS
[ https://issues.apache.org/jira/browse/HDFS-15025?focusedWorklogId=486810=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486810 ] ASF GitHub Bot logged work on HDFS-15025: - Author: ASF GitHub Bot Created on: 21/Sep/20 08:11 Start Date: 21/Sep/20 08:11 Worklog Time Spent: 10m Work Description: liuml07 commented on a change in pull request #2189: URL: https://github.com/apache/hadoop/pull/2189#discussion_r491859739 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockStatsMXBean.java ## @@ -145,9 +150,11 @@ public void testStorageTypeStatsJMX() throws Exception { Map storageTypeStats = (Map)entry.get("value"); typesPresent.add(storageType); if (storageType.equals("ARCHIVE") || storageType.equals("DISK") ) { -assertEquals(3l, storageTypeStats.get("nodesInService")); +assertEquals(3L, storageTypeStats.get("nodesInService")); Review comment: Hadoop releases before 2.10 are all end of life (EoL). Hadoop 2.10 is the only version using Java 7. We do not need any support, compile or runtime, for Java versions before Java 7. Hadoop 3.x are all using Java 8+. We do not need any Java 7 support in Hadoop 3. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 486810) Time Spent: 7h (was: 6h 50m) > Applying NVDIMM storage media to HDFS > - > > Key: HDFS-15025 > URL: https://issues.apache.org/jira/browse/HDFS-15025 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, hdfs > Reporter: YaYun Wang > Assignee: YaYun Wang > Priority: Major > Labels: pull-request-available > Attachments: Applying NVDIMM to HDFS.pdf, HDFS-15025.001.patch, > HDFS-15025.002.patch, HDFS-15025.003.patch, HDFS-15025.004.patch, > HDFS-15025.005.patch, HDFS-15025.006.patch, NVDIMM_patch(WIP).patch > > Time Spent: 7h > Remaining Estimate: 0h > > The non-volatile memory NVDIMM is faster than SSD and can be used > simultaneously with RAM, DISK, and SSD. Storing HDFS data directly on > NVDIMM not only improves the response rate of HDFS but also ensures the > reliability of the data.
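The one-character change in the review diff above (`3l` to `3L`) is purely cosmetic: both suffixes denote the same long literal, but lowercase `l` is easily misread as the digit `1`, which is why checkstyle flags it (the UpperEll check). A minimal demonstration:

```java
// Both long-literal suffixes produce identical values; the uppercase form
// is preferred because a lowercase 'l' is easily mistaken for the digit '1'.
public class LongLiteralDemo {
    public static void main(String[] args) {
        long lower = 3l;  // legal, but reads dangerously like 31
        long upper = 3L;  // same value, unambiguous
        System.out.println(lower == upper); // prints "true"
    }
}
```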
[jira] [Work logged] (HDFS-15025) Applying NVDIMM storage media to HDFS
[ https://issues.apache.org/jira/browse/HDFS-15025?focusedWorklogId=486809=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486809 ] ASF GitHub Bot logged work on HDFS-15025: - Author: ASF GitHub Bot Created on: 21/Sep/20 08:07 Start Date: 21/Sep/20 08:07 Worklog Time Spent: 10m Work Description: liuml07 commented on a change in pull request #2189: URL: https://github.com/apache/hadoop/pull/2189#discussion_r491857797 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockStatsMXBean.java ## @@ -145,9 +150,11 @@ public void testStorageTypeStatsJMX() throws Exception { Map storageTypeStats = (Map)entry.get("value"); typesPresent.add(storageType); if (storageType.equals("ARCHIVE") || storageType.equals("DISK") ) { -assertEquals(3l, storageTypeStats.get("nodesInService")); +assertEquals(3L, storageTypeStats.get("nodesInService")); Review comment: I have not used Java 7 for a while, but I remember vaguely this is actually supported? https://docs.oracle.com/javase/specs/jls/se7/html/jls-14.html This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 486809) Time Spent: 6h 50m (was: 6h 40m)
[jira] [Work logged] (HDFS-15025) Applying NVDIMM storage media to HDFS
[ https://issues.apache.org/jira/browse/HDFS-15025?focusedWorklogId=486806=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486806 ] ASF GitHub Bot logged work on HDFS-15025: - Author: ASF GitHub Bot Created on: 21/Sep/20 07:56 Start Date: 21/Sep/20 07:56 Worklog Time Spent: 10m Work Description: YaYun-Wang commented on a change in pull request #2189: URL: https://github.com/apache/hadoop/pull/2189#discussion_r491852471 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockStatsMXBean.java ## @@ -145,9 +150,11 @@ public void testStorageTypeStatsJMX() throws Exception { Map storageTypeStats = (Map)entry.get("value"); typesPresent.add(storageType); if (storageType.equals("ARCHIVE") || storageType.equals("DISK") ) { -assertEquals(3l, storageTypeStats.get("nodesInService")); +assertEquals(3L, storageTypeStats.get("nodesInService")); Review comment: `storageType` is a parameter of "java.lang.String" , and `switch()` does not support "java.lang.String" before java 1.7. So, will `if-else ` be more appropriate here? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 486806) Time Spent: 6h 40m (was: 6.5h)
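As the reviewers point out in the thread above, `switch` on String operands has in fact been supported since Java 7 (JLS §14.11), so the storage-type dispatch could be written either way. A hedged sketch of the switch form (the storage-type names and per-type counts here are illustrative, not taken from the actual test):

```java
// Since Java 7, switch accepts String operands, so an if/else chain over
// storage type names can be rewritten as a switch. Note that switching on
// a null String throws NullPointerException, hence the explicit guard.
public class StorageTypeSwitch {
    static int expectedNodesInService(String storageType) {
        if (storageType == null) {
            throw new IllegalArgumentException("storageType is null");
        }
        switch (storageType) {
            case "ARCHIVE":
            case "DISK":
                return 3; // illustrative counts only
            case "SSD":
            case "NVDIMM":
                return 1;
            default:
                return 0;
        }
    }

    public static void main(String[] args) {
        System.out.println(expectedNodesInService("DISK"));
        System.out.println(expectedNodesInService("NVDIMM"));
    }
}
```

Whether this reads better than `if-else` is a style call; since Hadoop 3.x requires Java 8+, both forms are available.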
[jira] [Work logged] (HDFS-15548) Allow configuring DISK/ARCHIVE storage types on same device mount
[ https://issues.apache.org/jira/browse/HDFS-15548?focusedWorklogId=486799&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486799 ]

ASF GitHub Bot logged work on HDFS-15548:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 21/Sep/20 07:34
            Start Date: 21/Sep/20 07:34
    Worklog Time Spent: 10m
      Work Description: hadoop-yetus commented on pull request #2288:
URL: https://github.com/apache/hadoop/pull/2288#issuecomment-695952357

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| +0 :ok: | reexec | 1m 13s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 36m 4s | trunk passed |
| +1 :green_heart: | compile | 1m 54s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 1m 33s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 1m 11s | trunk passed |
| +1 :green_heart: | mvnsite | 1m 48s | trunk passed |
| +1 :green_heart: | shadedclient | 20m 39s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 9s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 1m 39s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 4m 17s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 4m 15s | trunk passed |
||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 38s | the patch passed |
| +1 :green_heart: | compile | 1m 39s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| -1 :x: | javac | 1m 39s | hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 2 new + 602 unchanged - 0 fixed = 604 total (was 602) |
| +1 :green_heart: | compile | 1m 26s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| -1 :x: | javac | 1m 26s | hadoop-hdfs-project_hadoop-hdfs-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 2 new + 586 unchanged - 0 fixed = 588 total (was 586) |
| +1 :green_heart: | checkstyle | 1m 6s | the patch passed |
| +1 :green_heart: | mvnsite | 1m 38s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 2s | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 17m 18s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 1m 3s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 1m 38s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | findbugs | 4m 23s | the patch passed |
||| _ Other Tests _ |
| -1 :x: | unit | 145m 33s | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 1m 13s | The patch does not generate ASF License warnings. |
| | | 251m 57s | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.TestFileChecksumCompositeCrc |
| | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS |
| | hadoop.hdfs.TestFileChecksum |
| | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
| | hadoop.hdfs.server.datanode.TestBlockRecovery |
| | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
| | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery |
| | hadoop.hdfs.TestSnapshotCommands |
| | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
| | hadoop.hdfs.server.namenode.ha.TestUpdateBlockTailing |
| | hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks |
| | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
| | hadoop.hdfs.server.datanode.TestDataNodeUUID |

| Subsystem | Report/Notes |
[jira] [Work logged] (HDFS-15557) Log the reason why a storage log file can't be deleted
[ https://issues.apache.org/jira/browse/HDFS-15557?focusedWorklogId=486780&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486780 ]

ASF GitHub Bot logged work on HDFS-15557:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 21/Sep/20 06:16
            Start Date: 21/Sep/20 06:16
    Worklog Time Spent: 10m
      Work Description: hadoop-yetus commented on pull request #2274:
URL: https://github.com/apache/hadoop/pull/2274#issuecomment-695921398

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| +0 :ok: | reexec | 31m 11s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
||| _ trunk Compile Tests _ |
| -1 :x: | mvninstall | 16m 16s | root in trunk failed. |
| -1 :x: | compile | 0m 17s | hadoop-hdfs in trunk failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| -1 :x: | compile | 0m 26s | hadoop-hdfs in trunk failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. |
| +1 :green_heart: | checkstyle | 0m 51s | trunk passed |
| -1 :x: | mvnsite | 0m 25s | hadoop-hdfs in trunk failed. |
| +1 :green_heart: | shadedclient | 1m 49s | branch has no errors when building and testing our client artifacts. |
| -1 :x: | javadoc | 0m 26s | hadoop-hdfs in trunk failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| -1 :x: | javadoc | 0m 26s | hadoop-hdfs in trunk failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. |
| +0 :ok: | spotbugs | 3m 8s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| -1 :x: | findbugs | 0m 25s | hadoop-hdfs in trunk failed. |
||| _ Patch Compile Tests _ |
| -1 :x: | mvninstall | 0m 22s | hadoop-hdfs in the patch failed. |
| -1 :x: | compile | 0m 22s | hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| -1 :x: | javac | 0m 22s | hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| -1 :x: | compile | 0m 21s | hadoop-hdfs in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. |
| -1 :x: | javac | 0m 21s | hadoop-hdfs in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. |
| -0 :warning: | checkstyle | 0m 20s | The patch fails to run checkstyle in hadoop-hdfs |
| -1 :x: | mvnsite | 0m 23s | hadoop-hdfs in the patch failed. |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | shadedclient | 0m 21s | patch has no errors when building and testing our client artifacts. |
| -1 :x: | javadoc | 0m 23s | hadoop-hdfs in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| -1 :x: | javadoc | 0m 23s | hadoop-hdfs in the patch failed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01. |
| -1 :x: | findbugs | 0m 22s | hadoop-hdfs in the patch failed. |
||| _ Other Tests _ |
| -1 :x: | unit | 0m 21s | hadoop-hdfs in the patch failed. |
| +0 :ok: | asflicense | 0m 22s | ASF License check generated no output? |
| | | | 59m 1s | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2274/4/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2274 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 62e70170050b 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 7a6265ac425 |
| Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| mvninstall | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2274/4/artifact/out/branch-mvninstall-root.txt |
| compile |