Re: Ehcache and Mahout

Marko Ciric Fri, 30 Sep 2011 03:08:47 -0700

Actually, there is a whole set of "generic" classes that do caching by
using FastMap classes (you can check out the Mahout source from the Apache
repo).
These implementations give you the same effect as EhCache - they hold
all the data in memory.

The drawback of relying only on Mahout's on-heap caching is that the data
is loaded when these objects are constructed, all at once, rather than
incrementally as it is needed (which is what you can implement with
EhCache). If you are not going to do distributed calculations with
MapReduce algorithms, you'll need caching to speed things up.
If your data isn't too big and fits comfortably in the JVM heap, you can
use Mahout without EhCache. If you can't load all the data at once, you
should try to implement your own caching (this is possible with EhCache
itself) and make sure yourself that you don't run out of memory.
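
To illustrate the "own caching" option, here is a minimal sketch of a
size-bounded, load-on-demand cache using only the JDK. It is not Mahout's
or EhCache's API: the class name BoundedLruCache, the maxEntries limit and
the loader function are all hypothetical, and it bounds the number of
entries, not the actual bytes on the heap.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper (not part of Mahout or EhCache). */
class BoundedLruCache<K, V> {
    private final Function<K, V> loader;
    private final LinkedHashMap<K, V> entries;
    private int loads = 0;

    BoundedLruCache(int maxEntries, Function<K, V> loader) {
        if (maxEntries <= 0) {
            throw new IllegalArgumentException("maxEntries must be positive");
        }
        final int limit = maxEntries;
        this.loader = loader;
        // accessOrder = true: iteration order runs from least to most
        // recently accessed, so the "eldest" entry is the LRU one.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Called after each put; evict once the limit is exceeded.
                return size() > limit;
            }
        };
    }

    /** Returns the cached value, loading it on a miss. */
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            // Miss (a null result from the loader is treated as a miss
            // on every call): fetch from the slow source and remember it.
            value = loader.apply(key);
            loads++;
            entries.put(key, value);
        }
        return value;
    }

    synchronized int size() {
        return entries.size();
    }

    /** Number of times the loader was invoked so far. */
    synchronized int loadCount() {
        return loads;
    }
}
```

The loader would wrap whatever slow lookup you have (a data store read,
for example), and maxEntries has to be chosen by you so that the cached
values fit in the heap.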

On 11 September 2011 07:32, Ted Dunning <[email protected]> wrote:

> Caching in-process like this is likely to have much more satisfactory
> results than an external caching process.  Also, caching structures with
> repetitive access patterns is obviously better than caching single access
> data.  Thus caching small side data works well.  Map inputs do not.
>
> On Sat, Sep 10, 2011 at 6:28 PM, Robin Anil <[email protected]> wrote:
>
> > I once wrote a simple cache for HBaseDatastore in naive Bayes classifier
> > package and yes the speedup was really awesome, weights of high freq
> > words got cached and incremental lookup for rest of the words in a
> > document was really low. I had posted numbers on the old JIRA ticket
> >  On Sep 11, 2011 12:36 AM, "Dhruv Kumar" <[email protected]> wrote:
> > > Has anyone over here used EHcache with Mahout (or pure Hadoop jobs)?
> > >
> > > http://ehcache.org/
> > >
> > > For iterative MapReduce applications running on a NoSQL data store, it
> > > should provide a good performance boost by providing an in-memory
> > > object cache (I think). Any comments?
> >
>

-- 
Marko Ćirić
[email protected]
