If that 50 GB represents 20 million training examples for a classifier, then you are fine without Hadoop.
If it is data to cluster or run SVD on, the answer is probably the same, though that might be near the edge. If it is data for recommendations, that is a moderate amount, and whether to use Hadoop for the offline processing is a bit of a toss-up. For frequent itemset mining, I couldn't say.

On Sat, Oct 2, 2010 at 8:46 AM, Latency Buster <[email protected]> wrote:

> > What did you want to do with Mahout? How much data do you have?
> >
> > There are many capabilities that don't use Hadoop, some that require it.
> > Others allow you to choose to use
> > Hadoop only when you need to scale to large volumes.
> >
> I have around 50GB data and need to do some data mining.. I do not
> need realtime like performance and can live with slow performance...
>
> Can I assume that Hadoop is a 'not required' item in my case?
>
> Thanks,
>
