[ 
https://issues.apache.org/jira/browse/HDFS-14531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16861176#comment-16861176
 ] 

Todd Lipcon commented on HDFS-14531:
------------------------------------

It seems the DirectoryScanner serves two purposes: (1) detect blocks in memory 
that are missing from disk, and (2) detect blocks on disk that are missing from 
memory. In most cases, the first of these two purposes is much more important, 
since it protects against file system bugs/corruptions or naughty 
administrators accidentally removing files underneath the datanode. Accidental 
addition of block files is far less likely, and after a purposeful addition (eg 
during some manual restore procedure) an operator can always restart the DN to 
pick them up.

Given that, I think we could keep the functionality but reduce the memory usage 
by a few orders of magnitude using bloom filters (see the sketch after this 
list):
- when scanning the disk, instead of building a ScanInfo for each block, 
populate a bloom filter. For the example here of 1M replicas, a bloom filter 
with a 0.1% FP rate needs only ~1.7MB of RAM.
- when reconciling, check each in-memory block against the bloom filter. If the 
filter says "not present", you know you have a missing block. If it says 
"present", there is a 0.1% chance of failing to detect a missing block.
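
A minimal sketch of the two phases, assuming Guava's BloomFilter (which Hadoop 
already bundles); the diskBlockIds/memoryReplicaIds names are hypothetical 
stand-ins for the real scan results and replica map:

{code:java}
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
import java.util.List;

public class BloomScanSketch {
  // Scan phase: fold every on-disk block ID into a bloom filter rather
  // than building a per-block ScanInfo. 1M entries at a 0.1% FP rate
  // costs roughly 1.7MB.
  static BloomFilter<Long> scanDisk(List<Long> diskBlockIds) {
    BloomFilter<Long> filter =
        BloomFilter.create(Funnels.longFunnel(), 1_000_000, 0.001);
    for (long id : diskBlockIds) {
      filter.put(id);
    }
    return filter;
  }

  // Reconcile phase: a "definitely not present" answer is exact, so any
  // miss here is a genuinely missing block; "might be present" is wrong
  // with probability ~0.1%.
  static void reconcile(List<Long> memoryReplicaIds, BloomFilter<Long> filter) {
    for (long id : memoryReplicaIds) {
      if (!filter.mightContain(id)) {
        System.out.println("missing on disk: block " + id);
      }
    }
  }
}
{code}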

In order to guarantee that we eventually detect missing blocks, each pass of 
this algorithm can use a different hash seed for the bloom filter. Assuming the 
seeds are independent, the probability that a missing block goes undetected 
after N passes is FP^N (eg after two passes at a 0.1% FP rate, the combined FP 
rate is 0.0001%). So, within a few passes, the probability of undetected 
corruption shrinks below that of many other undetectable errors on HDFS (eg 
after four passes with FP = 10^-3, the probability of missing a block is less 
than the probability of an undetected 32-bit checksum error).
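
Guava's BloomFilter doesn't take an explicit seed, but the same effect can be 
had by salting the hashed bytes with the pass number. A hypothetical sketch:

{code:java}
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnel;

public class SaltedFilterSketch {
  // Mixing the pass number into the bytes being hashed gives each pass
  // an effectively independent hash function, so a block that slips
  // through as a false positive on one pass gets caught on a later one:
  // P(undetected after N passes) = FP^N.
  static BloomFilter<Long> newFilterForPass(long passNumber) {
    Funnel<Long> salted =
        (blockId, sink) -> sink.putLong(passNumber).putLong(blockId);
    return BloomFilter.create(salted, 1_000_000, 0.001);
  }
}
{code}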

Of course this doesn't solve the IO issue that Nathan mentioned, but that could 
be addressed by throttling.
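
For what it's worth, the throttling could be as simple as rate-limiting the 
per-file metadata reads, eg with Guava's RateLimiter (the 1000 ops/sec figure 
below is just a placeholder):

{code:java}
import com.google.common.util.concurrent.RateLimiter;
import java.io.File;

public class ThrottledScanSketch {
  // Cap the scan at a fixed number of file-metadata operations per
  // second so it can't saturate the disks.
  static void scan(File[] blockFiles) {
    RateLimiter limiter = RateLimiter.create(1000.0); // placeholder rate
    for (File f : blockFiles) {
      limiter.acquire();        // block until a permit is available
      long length = f.length(); // the metadata read we are throttling
      // ... fold (f, length) into the bloom filter ...
    }
  }
}
{code}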



> Datanode's ScanInfo requires excessive memory
> ---------------------------------------------
>
>                 Key: HDFS-14531
>                 URL: https://issues.apache.org/jira/browse/HDFS-14531
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 2.0.0-alpha
>            Reporter: Daryn Sharp
>            Priority: Major
>         Attachments: Screen Shot 2019-05-31 at 12.25.54 PM.png
>
>
> The DirectoryScanner's ScanInfo map consumes ~4.5X as much memory as the 
> replica map.  For 1.1M replicas, the replica map is ~91MB while the scan info 
> map is ~405MB.


