Hadoop does abstract storage as far as we're concerned -- storage hides
behind InputFormat/OutputFormat implementations. There's no in-memory
InputFormat that I know of, but you could surely write one. I don't
think the storage abstraction is what makes the difference between a
plain Java and a Hadoop-based implementation.
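To make the "you could surely write one" point concrete, here is a minimal sketch of an in-memory input format. Note this uses hypothetical, simplified stand-in interfaces (SimpleInputFormat, SimpleRecordReader), not the real org.apache.hadoop.mapreduce API, which would pull in InputSplit, TaskAttemptContext, etc. -- just enough to show that storage enters the picture only through these two interfaces:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified stand-ins for Hadoop's InputFormat/RecordReader
// (NOT the real org.apache.hadoop.mapreduce API).
interface SimpleRecordReader<V> {
    boolean nextKeyValue();
    V getCurrentValue();
}

interface SimpleInputFormat<V> {
    List<SimpleRecordReader<V>> getSplits(int numSplits);
}

// An in-memory implementation: each "split" is just a slice of a List
// held in the client's heap, so a job can run without touching HDFS.
class InMemoryInputFormat<V> implements SimpleInputFormat<V> {
    private final List<V> data;

    InMemoryInputFormat(List<V> data) { this.data = data; }

    @Override
    public List<SimpleRecordReader<V>> getSplits(int numSplits) {
        List<SimpleRecordReader<V>> readers = new ArrayList<>();
        int chunk = (data.size() + numSplits - 1) / numSplits;
        for (int start = 0; start < data.size(); start += chunk) {
            final Iterator<V> it =
                data.subList(start, Math.min(start + chunk, data.size())).iterator();
            readers.add(new SimpleRecordReader<V>() {
                private V current;
                @Override public boolean nextKeyValue() {
                    if (!it.hasNext()) return false;
                    current = it.next();
                    return true;
                }
                @Override public V getCurrentValue() { return current; }
            });
        }
        return readers;
    }
}

public class Demo {
    public static void main(String[] args) {
        SimpleInputFormat<String> fmt =
            new InMemoryInputFormat<>(List.of("a", "b", "c", "d", "e"));
        int count = 0;
        for (SimpleRecordReader<String> r : fmt.getSplits(2)) {
            while (r.nextKeyValue()) count++;
        }
        System.out.println(count);  // all 5 records seen across 2 splits
    }
}
```

The real thing would subclass InputFormat, return InputSplits from getSplits(JobContext), and hand back a RecordReader per split, but the shape is the same.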

Either it's written without access to all the data, in which case I
think you're always going to have something shaped like the
Hadoop-based implementation of today, or it's written to be able to
access all the data more or less at will, and then it indeed ought to
be no more complicated than a simple plain Java class.
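To illustrate that distinction with a toy (hypothetical) example -- computing a mean two ways: the first version sees one record at a time and carries running state, which is the shape a Hadoop-based implementation forces on you; the second assumes the whole dataset is reachable at will and collapses to a plain Java method:

```java
import java.util.List;

public class MeanTwoWays {
    // Hadoop-shaped: consumes records one at a time, keeps only running
    // state, and never assumes the full dataset is reachable.
    static double streamingMean(Iterable<Double> records) {
        double sum = 0;
        long n = 0;
        for (double x : records) { sum += x; n++; }
        return sum / n;
    }

    // Plain-Java shape: free to index into the full dataset at will.
    static double inMemoryMean(List<Double> data) {
        double sum = 0;
        for (int i = 0; i < data.size(); i++) sum += data.get(i);
        return sum / data.size();
    }

    public static void main(String[] args) {
        List<Double> data = List.of(1.0, 2.0, 3.0, 4.0);
        System.out.println(streamingMean(data)); // 2.5
        System.out.println(inMemoryMean(data));  // 2.5
    }
}
```

For a mean the two look similar; for algorithms that need random access (say, repeatedly revisiting arbitrary rows), only the second shape stays simple.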

On Thu, Jan 19, 2012 at 8:32 PM, Mike Spreitzer <[email protected]> wrote:
> On the other hand, if Hadoop (or something like it) were based on a
> storage abstraction that had multiple implementations, say one in the
> client's memory and one in a cluster's disks (and maybe also others at
> other interesting points in between), and placement of computation were
> deferred to that store, then we could make Daniel happy both when
> developing and when doing real work on really large datasets.
>
> Regards,
> Mike
