[
https://issues.apache.org/jira/browse/HDFS-5096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Colin Patrick McCabe updated HDFS-5096:
---------------------------------------
Attachment: HDFS-5096-caching.005.patch
This patch changes the way we do caching on the backend a bit. The new
approach is based on periodic scanning of the namespace. The scan interval is
controlled by {{dfs.namenode.path.based.cache.refresh.interval.ms}}. The time
complexity of each scan is O(num_path_based_cache_entries *
num_blocks_per_PBCE).
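To make the cost shape concrete, here is a minimal sketch of the rescan loop (class and method names are hypothetical, not the actual patch code): we visit every path-based cache entry and every block under it, so the work per scan is O(entries * blocks_per_entry).

```java
// Hypothetical sketch of the periodic rescan cost: one pass over every
// path-based cache entry and every block it covers.
import java.util.Arrays;
import java.util.List;

public class RescanSketch {
    // Returns the number of blocks visited in one scan.
    static int rescan(List<List<Long>> blocksPerEntry) {
        int scanned = 0;
        // Visit every path-based cache entry...
        for (List<Long> blocks : blocksPerEntry) {
            // ...and every block under it; the real code would decide
            // whether to issue cache/uncache work for each block here.
            for (long blockId : blocks) {
                scanned++;
            }
        }
        return scanned;
    }

    public static void main(String[] args) {
        List<List<Long>> entries = Arrays.asList(
                Arrays.asList(1L, 2L, 3L),
                Arrays.asList(4L, 5L));
        System.out.println(rescan(entries)); // prints 5
    }
}
```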
Eventually, it would be nice to have a more edge-triggered approach to caching.
However, I think that we are always going to want a scanner, to make sure that
we didn't miss anything. So we might as well do the scanner first, since it
covers most of the use cases we want. It also correctly handles still-open
files with non-finalized blocks, as well as the huge number of operations that
move inodes, without much struggle or bugginess.
I added "replication" as a field in PBCDi, PBCDe, PBCE, and plumbed it through
RPC, edit log, and the fsimage. We can now control how many cached replicas we
want, rather than just blindly caching on every DN as before.
The {{CacheManager}} no longer has its own internal lock. Instead, it relies
on the {{FSNamesystem}} lock. This is better, since we already hold the FSN
lock everywhere we call into the {{CacheManager}}. It makes
coordination between the CM and other components a lot easier and eliminates
some suspect locking scenarios.
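The convention can be sketched like this (class and method names are hypothetical stand-ins, not the actual Hadoop classes): the cache-manager state has no lock of its own and asserts that the caller already holds the namesystem lock.

```java
// Sketch of the locking convention: cache state is only mutated while
// the caller holds the (stand-in) FSNamesystem write lock.
import java.util.concurrent.locks.ReentrantReadWriteLock;

class FsnLockSketch {
    private final ReentrantReadWriteLock fsnLock = new ReentrantReadWriteLock();
    int directives = 0; // stand-in for CacheManager state

    // Stand-in for a CacheManager mutation: legal only under the FSN lock.
    void addDirective(String path) {
        assert fsnLock.isWriteLockedByCurrentThread()
                : "caller must hold the FSN write lock";
        directives++;
    }

    // Stand-in for an FSNamesystem RPC handler: take the lock, then call in.
    void handleAddDirectiveRpc(String path) {
        fsnLock.writeLock().lock();
        try {
            addDirective(path);
        } finally {
            fsnLock.writeLock().unlock();
        }
    }
}
```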
I unified {{CacheReplicationManager}} and {{CacheManager}}. Both of these
classes were doing the same thing using the same shared state, so it made sense
to only have one. The scanner code remains in {{CacheReplicationMonitor}} to
avoid {{CacheManager}} getting too big. I wanted to use an {{Executor}} for
the CRMon, but I was unable to find one that supported both scheduling the task
at a certain rate, and "poking" the task to get it to run immediately. So I
just used a {{Thread}}, which is simple enough, I think. While I was in there,
I also fixed some improper use of wall-clock time where monotonic time was needed.
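The thread pattern above can be sketched as follows (a simplified stand-in, not the actual CacheReplicationMonitor): a plain {{Thread}} body that runs at a fixed rate but can be "poked" to run immediately, computing its deadline from monotonic time ({{System.nanoTime}}) rather than wall-clock time.

```java
// Sketch of a periodic-but-pokeable monitor thread. Uses monotonic
// time for the scan deadline, so wall-clock adjustments can't skew it.
import java.util.concurrent.atomic.AtomicInteger;

class MonitorSketch implements Runnable {
    private final Object lock = new Object();
    private final long intervalMs;
    private boolean poked = false;
    private boolean stopped = false;
    final AtomicInteger scans = new AtomicInteger(); // observable run count

    MonitorSketch(long intervalMs) { this.intervalMs = intervalMs; }

    /** Wake the monitor so it rescans right away. */
    void poke() {
        synchronized (lock) {
            poked = true;
            lock.notifyAll();
        }
    }

    void shutdown() {
        synchronized (lock) {
            stopped = true;
            lock.notifyAll();
        }
    }

    @Override
    public void run() {
        while (true) {
            // Monotonic deadline for the next scheduled scan.
            long deadline = System.nanoTime() + intervalMs * 1_000_000L;
            synchronized (lock) {
                while (!poked && !stopped) {
                    long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                    if (remainingMs <= 0) break; // interval elapsed
                    try {
                        lock.wait(remainingMs);
                    } catch (InterruptedException e) {
                        return;
                    }
                }
                if (stopped) return;
                poked = false;
            }
            scans.incrementAndGet(); // placeholder for the namespace rescan
        }
    }
}
```

A standard {{ScheduledExecutorService}} gives the fixed rate but has no built-in way to trigger the task early, which is why a hand-rolled wait/notify loop like this is the simpler fit.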
On the DN side, I switched some of the logic to use block IDs rather than Block
objects. We should not need the genstamp and block length here, since that is
taken care of elsewhere. For now, those things remain in the cached block
report and DNA_CACHE, etc., but we should take them out soon since they're not
needed and take up space on the wire.
I also fixed an error message that was inverted (it was saying the configured
mlock space was "less than the datanode's available", but really it should
have been *more*). This could be split out into a separate patch, but it's a
tiny change, so I rolled it up into here.
Cached block state is stored in a slightly different data structure.
Previously, we were re-using a lot of the {{BlockManager}} structures such as
{{BlockInfo}} and {{BlocksMap}}. However, they're poorly suited in some ways,
since they have fields we don't care about, and lack some fields we do. Also,
using them raises the possibility of putting one of our {{BlockInfo}} objects
into one of their {{BlocksMap}} structures, which would create havoc.
The new structure has nodes of type {{CachedBlock}}. Each {{CachedBlock}}
object can be a member of several implicit linked lists. Each
{{DatanodeDescriptor}} has three caching-related lists: pendingCached, cached,
and pendingUncached. As the names imply, they track the blocks which are in
that state with regard to that DN.
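A simplified sketch of that per-datanode state (the real {{CachedBlock}} sits on intrusive linked lists, so one object can belong to several datanodes' lists without wrapper nodes; plain lists here just keep the sketch short, and names are illustrative):

```java
// Simplified sketch of per-DN caching state: three lists tracking a
// block's caching status with regard to that datanode.
import java.util.ArrayList;
import java.util.List;

class CachedBlockSketch {
    final long blockId;
    final short cachedReplication; // desired number of cached replicas
    CachedBlockSketch(long blockId, short cachedReplication) {
        this.blockId = blockId;
        this.cachedReplication = cachedReplication;
    }
}

class DatanodeCachingState {
    // Blocks the NN has asked this DN to cache, not yet confirmed.
    final List<CachedBlockSketch> pendingCached = new ArrayList<>();
    // Blocks this DN currently reports as cached.
    final List<CachedBlockSketch> cached = new ArrayList<>();
    // Blocks the NN has asked this DN to uncache.
    final List<CachedBlockSketch> pendingUncached = new ArrayList<>();

    // When a cache report confirms a block, move it pending -> cached.
    void confirmCached(CachedBlockSketch b) {
        if (pendingCached.remove(b)) {
            cached.add(b);
        }
    }
}
```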
> Automatically cache new data added to a cached path
> ---------------------------------------------------
>
> Key: HDFS-5096
> URL: https://issues.apache.org/jira/browse/HDFS-5096
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode, namenode
> Reporter: Andrew Wang
> Assignee: Colin Patrick McCabe
> Attachments: HDFS-5096-caching.005.patch
>
>
> For some applications, it's convenient to specify a path to cache, and have
> HDFS automatically cache new data added to the path without sending a new
> caching request or a manual refresh command.
> One example is new data appended to a cached file. It would be nice to
> re-cache a block at the new appended length, and cache new blocks added to
> the file.
> Another example is a cached Hive partition directory, where a user can drop
> new files directly into the partition. It would be nice if these new files
> were cached.
> In both cases, this automatic caching would happen after the file is closed,
> i.e. block replica is finalized.
--
This message was sent by Atlassian JIRA
(v6.1#6144)