Larger than memory is definitely useful. Larger than any single machine's cumulative disk is probably a bit excessive (but nice).

On Thu, Mar 10, 2011 at 8:33 PM, deneche abdelhakim <[email protected]> wrote:

> Ok, I am working on a new implementation of DecisionForests that should be
> able to take real advantage of Hadoop's ability to handle really big
> datasets. And by big datasets I mean datasets that are so big they cannot
> fit on a single machine's storage disk.
>
> But I am wondering what are the real world applications of such an
> implementation? I mean, I want to make sure this implementation will be
> really useful.
