I think this is the wrong angle to approach it from - like you mentioned in your
first post, the Linux file system cache *should* be taking care of this for
us. That it is not is a fault of the current implementation, not an inherent
problem.

I think one solution is HDFS-347 - I'm putting the finishing touches on a
design doc for that JIRA and should have it up in the next day or two.

-Todd

On Tue, Oct 6, 2009 at 5:25 PM, Edward Capriolo <[email protected]> wrote:

> On Tue, Oct 6, 2009 at 6:12 PM, Aaron Kimball <[email protected]> wrote:
> > Edward,
> >
> > Interesting concept. I imagine that implementing "CachedInputFormat" over
> > something like memcached would make for the most straightforward
> > implementation. You could store 64MB chunks in memcached and try to
> > retrieve them from there, falling back to the filesystem on failure. One
> > obvious potential drawback of this is that a memcached cluster might
> > store those blocks on different servers than the file chunks themselves,
> > leading to an increased number of network transfers during the mapping
> > phase. I don't know if it's possible to "pin" the objects in memcached to
> > particular nodes; you'd want to do this for mapper locality reasons.
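> >
> > Something like this is roughly what I'm picturing - completely untested,
> > and the class name, key scheme, and choice of the spymemcached client are
> > just placeholders:
> >
> > import java.net.InetSocketAddress;
> > import net.spy.memcached.MemcachedClient;
> > import org.apache.hadoop.conf.Configuration;
> > import org.apache.hadoop.fs.FSDataInputStream;
> > import org.apache.hadoop.fs.FileSystem;
> > import org.apache.hadoop.fs.Path;
> >
> > /** Reads 64MB file chunks from memcached, falling back to HDFS on a miss. */
> > public class CachedChunkReader {
> >   private static final int CHUNK_SIZE = 64 * 1024 * 1024;
> >   private final MemcachedClient cache;
> >   private final FileSystem fs;
> >
> >   public CachedChunkReader(Configuration conf, String cacheHost) throws Exception {
> >     this.cache = new MemcachedClient(new InetSocketAddress(cacheHost, 11211));
> >     this.fs = FileSystem.get(conf);
> >   }
> >
> >   public byte[] readChunk(Path file, long chunkIndex) throws Exception {
> >     String key = file.toUri() + "#" + chunkIndex;   // one key per 64MB chunk
> >     byte[] data = (byte[]) cache.get(key);          // try the cache first
> >     if (data == null) {                             // miss: go to the filesystem
> >       long offset = chunkIndex * CHUNK_SIZE;
> >       long remaining = fs.getFileStatus(file).getLen() - offset;
> >       data = new byte[(int) Math.min(CHUNK_SIZE, remaining)];
> >       FSDataInputStream in = fs.open(file);
> >       in.readFully(offset, data, 0, data.length);
> >       in.close();
> >       cache.set(key, 0, data);                      // populate for the next reader
> >     }
> >     return data;
> >   }
> > }
> >
> > Note that stock memcached caps item size at 1MB by default, so either the
> > chunk size or the memcached item limit would need adjusting.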
> >
> > I would say, though, that 1 GB out of 8 GB on a datanode is somewhat
> > ambitious. It's been my observation that people tend to write
> > memory-hungry mappers. If you've got 8 cores in a node, and the OS, the
> > datanode, and the tasktracker have already taken 1 GB each, that leaves
> > only 5 GB for task processes. Running 6 or 8 map tasks concurrently can
> > easily gobble that up. On a 16 GB datanode with 8 cores, you might get
> > that much wiggle room, though.
> >
> > - Aaron
> >
> >
> > On Tue, Oct 6, 2009 at 8:16 AM, Edward Capriolo <[email protected]> wrote:
> >
> >> After looking at the HBaseRegionServer and its functionality, I began
> >> wondering if there is a more general use case for memory caching of
> >> HDFS blocks/files. In many use cases people wish to store data on
> >> Hadoop indefinitely; however, the data from the last day, week, or
> >> month is probably the most actively used. For some Hadoop clusters the
> >> amount of raw new data could be less than the total RAM in the
> >> cluster.
> >>
> >> Also, some data will be used repeatedly: the same source data may be
> >> used to generate multiple result sets, and those results may be used
> >> as the input to other processes.
> >>
> >> I am thinking an answer could be to dedicate an amount of physical
> >> memory on each DataNode, or on several dedicated nodes, to a
> >> distributed memcache-like layer. Managing this cache should be
> >> straightforward since Hadoop blocks are pretty much static. (Say, for
> >> a DataNode with 8 GB of memory, dedicate 1 GB to a HadoopCacheServer.)
> >> With 1,000 nodes that aggregate cache would be roughly 1 TB.
> >>
> >> Additionally, we could create a new filesystem type, cachedhdfs,
> >> implemented as a facade over HDFS, or possibly implement a
> >> CachedInputFormat or CachedOutputFormat.
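> >>
> >> The facade could be fairly thin. A rough, untested sketch - the
> >> BlockCache interface here is invented, and a real one would live on
> >> the DataNode side:
> >>
> >> import java.io.IOException;
> >> import org.apache.hadoop.fs.FSDataInputStream;
> >> import org.apache.hadoop.fs.FileSystem;
> >> import org.apache.hadoop.fs.FilterFileSystem;
> >> import org.apache.hadoop.fs.Path;
> >>
> >> /** "cachedhdfs": delegates everything to the wrapped FileSystem, but
> >>  *  gives a cache layer first crack at open() calls. */
> >> public class CachedFileSystem extends FilterFileSystem {
> >>
> >>   /** Hypothetical in-memory block cache; details to be worked out. */
> >>   public interface BlockCache {
> >>     FSDataInputStream lookup(Path f) throws IOException;
> >>   }
> >>
> >>   private final BlockCache cache;
> >>
> >>   public CachedFileSystem(FileSystem rawFs, BlockCache cache) {
> >>     super(rawFs);
> >>     this.cache = cache;
> >>   }
> >>
> >>   @Override
> >>   public FSDataInputStream open(Path f, int bufferSize) throws IOException {
> >>     FSDataInputStream cached = cache.lookup(f);
> >>     if (cached != null) {
> >>       return cached;               // served from memory
> >>     }
> >>     return fs.open(f, bufferSize); // fall through to the real HDFS
> >>   }
> >> }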
> >>
> >> I know that the underlying filesystems have a cache, but I think
> >> Hadoop writing intermediate data is going to evict some of the data
> >> that "should be" semi-permanent.
> >>
> >> So has anyone looked into something like this? This was the closest
> >> thing I found.
> >>
> >> http://issues.apache.org/jira/browse/HADOOP-288
> >>
> >> My goal here is to keep recent data in memory so that tools like Hive
> >> can get a big boost on queries for new data.
> >>
> >> Does anyone have any ideas?
> >>
> >
>
> Aaron,
>
> Yes, 1 GB out of 8 GB was just an arbitrary value I picked. Remember
> that 16K of RAM did get a man to the moon. :) I am thinking the value
> would be configurable, say dfs.cache.mb.
>
> Also there are the details of cache eviction, or possibly of including
> and excluding specific paths and files.
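>
> In hdfs-site.xml that might look something like this (both property
> names are just placeholders at this point):
>
> <property>
>   <name>dfs.cache.mb</name>
>   <value>1024</value>
>   <description>Megabytes of DataNode memory dedicated to the block
>   cache; 0 disables it.</description>
> </property>
> <property>
>   <name>dfs.cache.include.paths</name>
>   <value>/warehouse/recent</value>
>   <description>Only blocks of files under these paths are cached.</description>
> </property>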
>
> Apart from the InputFormat concept, we could plug the cache directly
> into the DFSClient. That way the cache would always end up on the node
> where the data is. Otherwise the InputFormat would have to manage that,
> which would be a lot of work. I think if we prove the concept we can
> then follow up and optimize it.
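>
> For the cache itself, a size-bounded LRU map keyed by block ID might be
> enough for a first pass. A rough sketch, with nothing DFSClient-specific
> wired in yet:
>
> import java.util.Iterator;
> import java.util.LinkedHashMap;
> import java.util.Map;
>
> /** LRU cache of block data keyed by HDFS block ID. The DFSClient (or
>  *  the DataNode) would consult this before reading a block from disk. */
> public class BlockCache {
>   private final long capacityBytes;   // e.g. dfs.cache.mb * 1024 * 1024
>   private long usedBytes = 0;
>   private final LinkedHashMap<Long, byte[]> blocks =
>       new LinkedHashMap<Long, byte[]>(16, 0.75f, true);  // access order = LRU
>
>   public BlockCache(long capacityBytes) {
>     this.capacityBytes = capacityBytes;
>   }
>
>   public synchronized byte[] get(long blockId) {
>     return blocks.get(blockId);
>   }
>
>   public synchronized void put(long blockId, byte[] data) {
>     byte[] previous = blocks.put(blockId, data);
>     if (previous != null) {
>       usedBytes -= previous.length;
>     }
>     usedBytes += data.length;
>     // Evict least-recently-used blocks until we are back under the limit.
>     Iterator<Map.Entry<Long, byte[]>> eldest = blocks.entrySet().iterator();
>     while (usedBytes > capacityBytes && eldest.hasNext()) {
>       usedBytes -= eldest.next().getValue().length;
>       eldest.remove();
>     }
>   }
> }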
>
> I am poking around the Hadoop internals to see what options we have.
> For a first implementation I will probably patch some code, run some
> tests, and profile the performance.
>
