[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

Colin Patrick McCabe (JIRA) Tue, 16 Dec 2014 18:14:34 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249317#comment-14249317
 ]


Colin Patrick McCabe commented on HDFS-7430:
--------------------------------------------

bq. getSubdirEntries, we can directly set long now = Time.monotonicNow and save 
the else case.

ok

bq. It'd be nicer to use Paths.get rather than stringing together all these 
File objects. Yay JDK7!

good idea
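
For illustration only, a minimal sketch of the difference. The class name and the directory components ("current", "finalized") are assumptions made for this example, not code from the patch.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; names and layout are illustrative assumptions.
final class PathsGetSketch {
  // Pre-JDK7 style: nest File objects to descend one directory at a time.
  static File withFiles(String base, String bpid, String subdir) {
    File bpDir = new File(base, bpid);
    File current = new File(bpDir, "current");
    File finalized = new File(current, "finalized");
    return new File(finalized, subdir);
  }

  // JDK7 style: Paths.get joins all of the components in a single call.
  static Path withPaths(String base, String bpid, String subdir) {
    return Paths.get(base, bpid, "current", "finalized", subdir);
  }
}
```

Both methods name the same location; the second just avoids the intermediate objects.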

bq. getNextSubDir, getNextFinalizedDir, getNextFinalizedSubDir, 
getSubdirEntries don't need to throw IOException, neither does nextBlock. I'm a 
bit surprised, but it turns out listing dirs only throws a RuntimeException on 
permission error (which we shouldn't have to worry about). Can clean the 
resulting try/catch in VolumeScanner too.

Let me fix this so that we throw {{IOException}} from these functions.  We can 
use the JDK7 APIs to do this.
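
As a rough sketch of what that could look like (the class and method names here are hypothetical, not the ones in the patch): {{Files.newDirectoryStream}} reports failures as {{IOException}}, whereas {{File#list}} returns null when the listing fails.

```java
import java.io.IOException;
import java.nio.file.DirectoryIteratorException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; not the actual getSubdirEntries implementation.
final class SubdirListingSketch {
  // Returns the sorted names of the entries in dir.
  // A missing or unreadable directory surfaces as an IOException.
  static List<String> listEntries(Path dir) throws IOException {
    List<String> names = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
      for (Path entry : stream) {
        names.add(entry.getFileName().toString());
      }
    } catch (DirectoryIteratorException e) {
      // Errors hit while iterating are wrapped in an unchecked exception;
      // unwrap so that callers only have to handle IOException.
      throw e.getCause();
    }
    Collections.sort(names);
    return names;
  }
}
```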

bq. Is it worth doing rename tricks to atomically save the cursor file? I 
remember reading an LWN article about this being surprisingly hard to do 
correctly, so maybe worth looking that up too. Could also just double buffer 
and look at ctimes too.

Sure.  Let's write to a temporary file and then rename once we're done.  That's 
easy to do.

rename is atomic on almost all production systems.  Anyway, even if we end up 
with an empty or missing cursor file once in a blue moon, it just means we 
restart the scan.
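
A minimal sketch of the write-then-rename approach, assuming a hypothetical {{CursorSaveSketch}} class and a ".tmp" suffix (neither is from the patch). It does not fsync, so a crash at the wrong moment could still leave an empty or missing cursor file, which the loader treats as "restart the scan".

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; file naming and cursor format are assumptions.
final class CursorSaveSketch {
  // Writes the cursor to a sibling temporary file, then renames it into place.
  static void save(Path cursor, String contents) throws IOException {
    Path tmp = cursor.resolveSibling(cursor.getFileName() + ".tmp");
    Files.write(tmp, contents.getBytes(StandardCharsets.UTF_8));
    // ATOMIC_MOVE maps to rename(2) on POSIX systems. Whether an existing
    // target is replaced is implementation specific according to the Javadoc.
    Files.move(tmp, cursor, StandardCopyOption.ATOMIC_MOVE);
  }

  // Returns the saved cursor, or null if the file is missing or empty,
  // in which case the caller would restart the scan from the beginning.
  static String loadOrNull(Path cursor) throws IOException {
    if (!Files.exists(cursor)) {
      return null;
    }
    String s = new String(Files.readAllBytes(cursor), StandardCharsets.UTF_8).trim();
    return s.isEmpty() ? null : s;
  }
}
```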

bq. The cache timeout seems rather low. With the default params of 1MB/s and 
30s timeout, we'll timeout each time if blocks are bigger than 30MB.

Good point.  I will up it to 15 minutes; that seems reasonable for this and 
should let us scan a few blocks in between.
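
The arithmetic behind this can be sketched as follows. The 128MB block size and treating 1MB as 2^20 bytes are assumptions made for the example; the class name is hypothetical.

```java
// Hypothetical helper illustrating the scan-rate vs. timeout arithmetic.
final class CacheTimeoutMath {
  static final long MB = 1024L * 1024L;

  // Bytes a scanner running at bytesPerSec can read before the timeout fires.
  static long bytesBeforeTimeout(long bytesPerSec, long timeoutSec) {
    return bytesPerSec * timeoutSec;
  }

  // Number of whole blocks of the given size that fit into one timeout window.
  static long wholeBlocksBeforeTimeout(long bytesPerSec, long timeoutSec, long blockSize) {
    return bytesBeforeTimeout(bytesPerSec, timeoutSec) / blockSize;
  }
}
```

At 1MB/s, a 30 second timeout covers 30MB, so a block larger than that cannot be finished within one window. A 15 minute (900 second) timeout covers 900MB, or 7 whole blocks at the assumed 128MB block size.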

bq. Is this OBO? If missing, the return value is -index - 1. With the code as 
it is, we'll never return 0, which is what we want if the first item in the 
list was deleted.

Yeah, I think this is wrong.  Will change.
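
For reference, a small sketch of the conversion, assuming the lookup uses {{Collections.binarySearch}} on a sorted list of names (the class and method names here are hypothetical). When the key is absent, the result is -(insertion point) - 1, so the insertion point is recovered as -result - 1, which can be 0.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper; not the code under review.
final class ResumeIndexSketch {
  // Returns the index to resume from in a sorted list: the position of
  // lastSeen if it is still present, otherwise the position of the first
  // entry that sorts after it (possibly 0, possibly sorted.size()).
  static int resumeIndex(List<String> sorted, String lastSeen) {
    int idx = Collections.binarySearch(sorted, lastSeen);
    if (idx >= 0) {
      return idx;
    }
    // Not found: binarySearch returned -(insertionPoint) - 1.
    return -idx - 1;
  }
}
```

With the list [b, c, d] and a deleted first entry "a", the search returns -1 and the resume index is 0.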

bq. Spacing is a little off with rest of file

This is actually a problem with the other keys at the end of the file, not with 
the new keys I added.  If you look carefully, you can see they're spaced with 3 
spaces rather than 2 :P

> Refactor the BlockScanner to use O(1) memory and use multiple threads
> ---------------------------------------------------------------------
>
>                 Key: HDFS-7430
>                 URL: https://issues.apache.org/jira/browse/HDFS-7430
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.7.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
> HDFS-7430.004.patch, HDFS-7430.005.patch, memory.png
>
>
> We should update the BlockScanner to use a constant amount of memory by 
> keeping track of what block was scanned last, rather than by tracking the 
> scan status of all blocks in memory.  Also, instead of having just one 
> thread, we should have a verification thread per hard disk (or other volume), 
> scanning at a configurable rate of bytes per second.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
