[jira] [Updated] (HDFS-12102) VolumeScanner throttle dropped (fast scan enabled) when there is a corrupt block
     [ https://issues.apache.org/jira/browse/HDFS-12102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashwin Ramesh updated HDFS-12102:
---------------------------------
    Attachment: HDFS-12102-003.patch

> VolumeScanner throttle dropped (fast scan enabled) when there is a corrupt block
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-12102
>                 URL: https://issues.apache.org/jira/browse/HDFS-12102
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, hdfs
>    Affects Versions: 2.8.2
>            Reporter: Ashwin Ramesh
>            Priority: Minor
>             Fix For: 2.8.2
>
>         Attachments: HDFS-12102-001.patch, HDFS-12102-002.patch, HDFS-12102-003.patch
>
> When the VolumeScanner sees a corrupt block, it restarts the scan and scans the blocks at a much faster rate, with a negligible scan period. This is so that the scan does not take up to three weeks to report corrupt blocks, since one corrupt block increases the likelihood that there are more corrupt blocks on the volume.
[jira] [Updated] (HDFS-12102) VolumeScanner throttle dropped (fast scan enabled) when there is a corrupt block
     [ https://issues.apache.org/jira/browse/HDFS-12102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashwin Ramesh updated HDFS-12102:
---------------------------------
    Status: Patch Available  (was: Open)

> VolumeScanner throttle dropped (fast scan enabled) when there is a corrupt block
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-12102
>                 URL: https://issues.apache.org/jira/browse/HDFS-12102
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, hdfs
>    Affects Versions: 2.8.2
>            Reporter: Ashwin Ramesh
>            Priority: Minor
>             Fix For: 2.8.2
>
>         Attachments: HDFS-12102-001.patch, HDFS-12102-002.patch
[jira] [Commented] (HDFS-12102) VolumeScanner throttle dropped (fast scan enabled) when there is a corrupt block
     [ https://issues.apache.org/jira/browse/HDFS-12102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080593#comment-16080593 ]

Ashwin Ramesh commented on HDFS-12102:
--------------------------------------

[~nroberts] Added config options, removed the unnecessary debug statements, made fast scan disabled by default (it is now enabled via a boolean config option), and removed corruptBlockThreshold.

[~arpitagarwal] The change is a VolumeScanner feature that starts a continuous, high-bandwidth scan so that a volume is scanned much more quickly once corruption has been determined to be highly likely.

> VolumeScanner throttle dropped (fast scan enabled) when there is a corrupt block
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-12102
>                 URL: https://issues.apache.org/jira/browse/HDFS-12102
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, hdfs
>    Affects Versions: 2.8.2
>            Reporter: Ashwin Ramesh
>            Priority: Minor
>             Fix For: 2.8.2
>
>         Attachments: HDFS-12102-001.patch, HDFS-12102-002.patch
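The thread does not spell out the new configuration keys, so the following is only a rough sketch of how such a boolean toggle and bandwidth limit could be read on the DataNode side; the class, key names, and defaults (FastScanConfSketch, dfs.datanode.volume.scanner.fast.scan.*) are illustrative assumptions, not the keys introduced by the attached patches.

{code}
import org.apache.hadoop.conf.Configuration;

// Illustrative only: the key names and defaults below are assumptions made for
// this sketch, not the actual configuration added by the HDFS-12102 patch.
public class FastScanConfSketch {
  // Hypothetical key: fast scan stays disabled unless explicitly turned on.
  static final String FAST_SCAN_ENABLED_KEY =
      "dfs.datanode.volume.scanner.fast.scan.enabled";
  static final boolean FAST_SCAN_ENABLED_DEFAULT = false;

  // Hypothetical key: bandwidth used once the throttle is dropped.
  static final String FAST_SCAN_BANDWIDTH_KEY =
      "dfs.datanode.volume.scanner.fast.scan.bytes.per.second";
  static final long FAST_SCAN_BANDWIDTH_DEFAULT = 100L * 1024 * 1024;

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    boolean fastScanEnabled =
        conf.getBoolean(FAST_SCAN_ENABLED_KEY, FAST_SCAN_ENABLED_DEFAULT);
    long fastScanBytesPerSecond =
        conf.getLong(FAST_SCAN_BANDWIDTH_KEY, FAST_SCAN_BANDWIDTH_DEFAULT);
    System.out.println("fast scan enabled = " + fastScanEnabled
        + ", bandwidth = " + fastScanBytesPerSecond + " bytes/s");
  }
}
{code}

Keeping the feature off by default matches the comment above: existing clusters keep the current throttled behavior unless an operator opts in.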
[jira] [Updated] (HDFS-12102) VolumeScanner throttle dropped (fast scan enabled) when there is a corrupt block
     [ https://issues.apache.org/jira/browse/HDFS-12102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashwin Ramesh updated HDFS-12102:
---------------------------------
    Attachment: HDFS-12102-002.patch

> VolumeScanner throttle dropped (fast scan enabled) when there is a corrupt block
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-12102
>                 URL: https://issues.apache.org/jira/browse/HDFS-12102
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, hdfs
>    Affects Versions: 2.8.2
>            Reporter: Ashwin Ramesh
>            Priority: Minor
>             Fix For: 2.8.2
>
>         Attachments: HDFS-12102-001.patch, HDFS-12102-002.patch
[jira] [Updated] (HDFS-12102) VolumeScanner throttle dropped (fast scan enabled) when there is a corrupt block
     [ https://issues.apache.org/jira/browse/HDFS-12102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashwin Ramesh updated HDFS-12102:
---------------------------------
    Attachment: HDFS-12102-001.patch

> VolumeScanner throttle dropped (fast scan enabled) when there is a corrupt block
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-12102
>                 URL: https://issues.apache.org/jira/browse/HDFS-12102
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, hdfs
>    Affects Versions: 2.8.2
>            Reporter: Ashwin Ramesh
>            Priority: Minor
>             Fix For: 2.8.2
>
>         Attachments: HDFS-12102-001.patch
[jira] [Created] (HDFS-12102) VolumeScanner throttle dropped (fast scan enabled) when there is a corrupt block
Ashwin Ramesh created HDFS-12102:
------------------------------------

             Summary: VolumeScanner throttle dropped (fast scan enabled) when there is a corrupt block
                 Key: HDFS-12102
                 URL: https://issues.apache.org/jira/browse/HDFS-12102
             Project: Hadoop HDFS
          Issue Type: New Feature
          Components: datanode, hdfs
    Affects Versions: 2.8.2
            Reporter: Ashwin Ramesh
            Priority: Minor
             Fix For: 2.8.2


When the VolumeScanner sees a corrupt block, it restarts the scan and scans the blocks at a much faster rate, with a negligible scan period. This is so that the scan does not take up to three weeks to report corrupt blocks, since one corrupt block increases the likelihood that there are more corrupt blocks on the volume.
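As a rough illustration of the throttle-drop behavior this issue describes, the sketch below shows the general idea under assumed names and values (FastScanSketch, fastScanEnabled, onCorruptBlock, and the example bandwidth numbers); it is not the actual VolumeScanner code or the attached patch.

{code}
// Sketch only: names, structure, and numbers are assumptions, not the actual
// VolumeScanner implementation or the HDFS-12102 patch.
class FastScanSketch {
  // Normal throttle spreads a full volume scan over roughly three weeks.
  private static final long NORMAL_BYTES_PER_SECOND = 1L * 1024 * 1024;       // example value
  // Fast scan: effectively unthrottled once corruption is suspected.
  private static final long FAST_SCAN_BYTES_PER_SECOND = 100L * 1024 * 1024;  // example value

  private final boolean fastScanEnabled;   // hypothetical boolean config option
  private long bytesPerSecond = NORMAL_BYTES_PER_SECOND;
  private boolean restartRequested = false;

  FastScanSketch(boolean fastScanEnabled) {
    this.fastScanEnabled = fastScanEnabled;
  }

  /** Called by the scan loop when a block fails checksum verification. */
  void onCorruptBlock(long blockId) {
    if (fastScanEnabled && bytesPerSecond != FAST_SCAN_BYTES_PER_SECOND) {
      // One corrupt block raises the odds that more blocks on this volume are
      // corrupt, so drop the throttle and restart the scan from the beginning.
      bytesPerSecond = FAST_SCAN_BYTES_PER_SECOND;
      restartRequested = true;
    }
  }

  long currentThrottleBytesPerSecond() { return bytesPerSecond; }
  boolean shouldRestartScan() { return restartRequested; }
}
{code}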
[jira] [Updated] (HDFS-12092) VolumeScanner exits when block metadata file is corrupted on datanode.
     [ https://issues.apache.org/jira/browse/HDFS-12092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashwin Ramesh updated HDFS-12092:
---------------------------------
    Description:
Restarted a datanode, corrupted the metafile for blk_1073741825 with something like echo '' > blk_1073741825_1001.meta, and datanode logs reveal that the VolumeScanner exits due to an illegal argument exception. Here is the relevant trace:

{code}
2017-07-05 22:03:41,878 [VolumeScannerThread()] DEBUG datanode.VolumeScanner: start scanning block BP-955735389-###-1494002319684:blk_1073741825_1001
2017-07-05 22:03:41,879 [VolumeScannerThread()] ERROR datanode.VolumeScanner: VolumeScanner() exiting because of exception
java.lang.IllegalArgumentException: id=122 out of range [0, 5)
	at org.apache.hadoop.util.DataChecksum$Type.valueOf(DataChecksum.java:67)
	at org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:123)
	at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:178)
	at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:142)
	at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:156)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.loadLastPartialChunkChecksum(FsVolumeImpl.java:1022)
	at org.apache.hadoop.hdfs.server.datanode.FinalizedReplica.getLastChecksumAndDataLen(FinalizedReplica.java:104)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:259)
	at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:484)
	at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:614)
	at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:699)
2017-07-05 22:03:41,879 [VolumeScannerThread()] INFO datanode.VolumeScanner: VolumeScanner() exiting.
{code}

    was:
Restarted a datanode, corrupted the metafile for blk_1073741825 with something like echo '' > blk_1073741825_1001.meta, and datanode logs reveal that the VolumeScanner exits due to an illegal argument exception. Here is the relevant trace:

2017-07-05 22:03:41,878 [VolumeScannerThread(/grid/0/tmp/hadoop-hdfsqa/dfs/data)] DEBUG datanode.VolumeScanner: start scanning block BP-955735389-10.215.76.172-1494002319684:blk_1073741825_1001
2017-07-05 22:03:41,879 [VolumeScannerThread(/grid/0/tmp/hadoop-hdfsqa/dfs/data)] ERROR datanode.VolumeScanner: VolumeScanner(/grid/0/tmp/hadoop-hdfsqa/dfs/data, DS-7817e9a3-c179-4901-8757-af965b27b689) exiting because of exception
java.lang.IllegalArgumentException: id=122 out of range [0, 5)
	at org.apache.hadoop.util.DataChecksum$Type.valueOf(DataChecksum.java:67)
	at org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:123)
	at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:178)
	at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:142)
	at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:156)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.loadLastPartialChunkChecksum(FsVolumeImpl.java:1022)
	at org.apache.hadoop.hdfs.server.datanode.FinalizedReplica.getLastChecksumAndDataLen(FinalizedReplica.java:104)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:259)
	at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:484)
	at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:614)
	at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:699)
2017-07-05 22:03:41,879 [VolumeScannerThread(/grid/0/tmp/hadoop-hdfsqa/dfs/data)] INFO datanode.VolumeScanner: VolumeScanner(/grid/0/tmp/hadoop-hdfsqa/dfs/data, DS-7817e9a3-c179-4901-8757-af965b27b689) exiting.

> VolumeScanner exits when block metadata file is corrupted on datanode.
> -----------------------------------------------------------------------
>
>                 Key: HDFS-12092
>                 URL: https://issues.apache.org/jira/browse/HDFS-12092
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, hdfs
>    Affects Versions: 2.8.0
>            Reporter: Ashwin Ramesh
>
> Restarted a datanode, corrupted the metafile for blk_1073741825 with something like echo '' > blk_1073741825_1001.meta, and datanode logs reveal that the VolumeScanner exits due to an illegal argument exception.
> Here is the relevant trace:
[jira] [Created] (HDFS-12092) VolumeScanner exits when block metadata file is corrupted on datanode.
Ashwin Ramesh created HDFS-12092:
------------------------------------

             Summary: VolumeScanner exits when block metadata file is corrupted on datanode.
                 Key: HDFS-12092
                 URL: https://issues.apache.org/jira/browse/HDFS-12092
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: datanode, hdfs
    Affects Versions: 2.8.0
            Reporter: Ashwin Ramesh


Restarted a datanode, corrupted the metafile for blk_1073741825 with something like echo '' > blk_1073741825_1001.meta, and datanode logs reveal that the VolumeScanner exits due to an illegal argument exception. Here is the relevant trace:

2017-07-05 22:03:41,878 [VolumeScannerThread(/grid/0/tmp/hadoop-hdfsqa/dfs/data)] DEBUG datanode.VolumeScanner: start scanning block BP-955735389-10.215.76.172-1494002319684:blk_1073741825_1001
2017-07-05 22:03:41,879 [VolumeScannerThread(/grid/0/tmp/hadoop-hdfsqa/dfs/data)] ERROR datanode.VolumeScanner: VolumeScanner(/grid/0/tmp/hadoop-hdfsqa/dfs/data, DS-7817e9a3-c179-4901-8757-af965b27b689) exiting because of exception
java.lang.IllegalArgumentException: id=122 out of range [0, 5)
	at org.apache.hadoop.util.DataChecksum$Type.valueOf(DataChecksum.java:67)
	at org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:123)
	at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:178)
	at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:142)
	at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:156)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.loadLastPartialChunkChecksum(FsVolumeImpl.java:1022)
	at org.apache.hadoop.hdfs.server.datanode.FinalizedReplica.getLastChecksumAndDataLen(FinalizedReplica.java:104)
	at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:259)
	at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:484)
	at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:614)
	at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:699)
2017-07-05 22:03:41,879 [VolumeScannerThread(/grid/0/tmp/hadoop-hdfsqa/dfs/data)] INFO datanode.VolumeScanner: VolumeScanner(/grid/0/tmp/hadoop-hdfsqa/dfs/data, DS-7817e9a3-c179-4901-8757-af965b27b689) exiting.
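The failure mode above is a RuntimeException (an IllegalArgumentException from DataChecksum.Type.valueOf) escaping the scan path and terminating the VolumeScanner thread. Purely as an illustration of one way to contain it, and not as the actual fix for this issue, the sketch below probes a block's .meta header and reports an out-of-range checksum type id as corruption instead of throwing; the class, method, and enum names are hypothetical, and the header layout assumed here (a 2-byte version followed by a 1-byte checksum type id) is inferred from the readHeader path in the trace.

{code}
// Illustration only; not the real VolumeScanner/BlockMetadataHeader code and
// not the eventual fix for HDFS-12092. The header layout (2-byte version,
// then a 1-byte checksum type id) is an assumption based on the trace above.
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

class MetaHeaderProbeSketch {

  /** Outcome of probing a block's .meta header. */
  enum HeaderState { OK, CORRUPT, IO_ERROR }

  /**
   * Reads just enough of the .meta file to validate the checksum type byte.
   * An out-of-range value (the "id=122 out of range [0, 5)" case above) is
   * reported as CORRUPT instead of propagating as IllegalArgumentException,
   * so a scanning thread could mark the block suspect and keep running.
   */
  static HeaderState probeMetaHeader(String metaPath) {
    try (DataInputStream in = new DataInputStream(new FileInputStream(metaPath))) {
      in.readShort();                          // header version (not validated here)
      int checksumTypeId = in.readUnsignedByte();
      return checksumTypeId < 5 ? HeaderState.OK : HeaderState.CORRUPT;
    } catch (IOException e) {
      return HeaderState.IO_ERROR;             // truncated or unreadable meta file
    }
  }

  public static void main(String[] args) {
    System.out.println(args[0] + ": " + probeMetaHeader(args[0]));
  }
}
{code}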