Hi all, in my travels I've come across a small startup that I thought might be of interest to the user@ audience. It's MapFreeduce (http://mapfreeduce.com/), and they're putting an interesting twist on MapReduce. They've built a simplified MapReduce API, one whose workers can run as Java applets in the browser sandbox.
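To give a feel for what a "simplified" MapReduce might mean, here is a rough sketch of my own. It is not MapFreeduce's actual API, which I'm not reproducing here; the names TinyMapReduce, Mapper, Reducer and run are all made up for illustration. It runs entirely in memory in a single JVM and has just a map step, a group-by-key step and a reduce step, with no HDFS, no combiners and no InputFormats.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

// Illustrative only: a hypothetical, stripped-down MapReduce that runs in memory.
// None of these names come from MapFreeduce or Hadoop.
class TinyMapReduce {

    /** Hypothetical mapper: takes one input item and emits (key, value) pairs. */
    interface Mapper<I, K, V> {
        void map(I item, BiConsumer<K, V> emit);
    }

    /** Hypothetical reducer: folds all values collected for one key into one result. */
    interface Reducer<K, V, R> {
        R reduce(K key, List<V> values);
    }

    static <I, K extends Comparable<K>, V, R> Map<K, R> run(
            List<I> input, Mapper<I, K, V> mapper, Reducer<K, V, R> reducer) {
        // Map phase plus "shuffle": group every emitted value under its key.
        Map<K, List<V>> groups = new TreeMap<>();
        for (I item : input) {
            mapper.map(item, (k, v) ->
                    groups.computeIfAbsent(k, unused -> new ArrayList<>()).add(v));
        }
        // Reduce phase: one call per key, in sorted key order.
        Map<K, R> output = new TreeMap<>();
        for (Map.Entry<K, List<V>> entry : groups.entrySet()) {
            output.put(entry.getKey(), reducer.reduce(entry.getKey(), entry.getValue()));
        }
        return output;
    }

    /** Word count expressed with only the map and reduce steps above. */
    static Map<String, Integer> wordCount(List<String> lines) {
        Mapper<String, String, Integer> mapper = (line, emit) -> {
            for (String word : line.toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) {
                    emit.accept(word, 1);
                }
            }
        };
        Reducer<String, Integer, Integer> reducer = (word, ones) -> {
            int sum = 0;
            for (int one : ones) {
                sum += one;
            }
            return sum;
        };
        return run(lines, mapper, reducer);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("the quick brown fox", "the lazy dog", "The end");
        // prints {brown=1, dog=1, end=1, fox=1, lazy=1, quick=1, the=3}
        System.out.println(wordCount(lines));
    }
}
```

A real framework would of course spread the map and reduce calls across workers; the point here is only how little API surface a job like word count needs.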
Having played with it myself, I find it interesting for two reasons. One, it asks whether a simpler version of MapReduce than what you get in Hadoop is viable. To be clear, it's not Hadoop. Can you do something interesting without, say, direct access to HDFS? Combiners? Custom InputFormats? Two, since it can fairly automatically turn office PCs with a browser into safe background MR workers, it might let organizational skunk-works create a cluster on the cheap out of truly unused cycles.

I managed to reconstruct parts of the recommender pipeline on this framework without too much modification, so it is possible to 'port' some parts of Mahout to it, if not all. MapReduce fans will probably enjoy taking a look at what they can get away with in a browser sandbox.

From a conversation with their founder I know they'd really like feedback and testers. Here's their pitch and plea for beta users in their own words. (I have no affiliation with or interest in the company.)

*"MapFreeduce.com is a Washington DC-based startup making Big Data accessible to everyone. Our software service enables users to quickly and easily build a mapreduce cluster from the spare CPU-cycles of available computers without installing or configuring any software. To add a node to your MapFreeduce cluster and increase its power, you simply click on a link from any idle computer. You can scale your cluster to thousands of nodes to perform computation- and data-intensive tasks such as web indexing, data mining, business analytics, data warehousing, machine learning, financial analysis, scientific simulation, and bioinformatics research. MapFreeduce allows you to focus on crunching your data without having to worry about either the cost and complexity of setting up a traditional hardware cluster or the perpetual fees charged per hour and per node by common cloud providers.

We are looking for individuals that would be interested in joining our free, private beta test and/or providing feedback to our service."*
