Hi Nick,

On 27.12.2013, at 17:34, Nick Martin wrote:
> Hi all, have a question re: tutorials...
>
> At my org we use Mahout primarily for recommendations - both eComm and
> offline for other purposes. I'm new to Mahout (~9 months of toying around)
> and honestly I never actually "touch" the code - I simply use what's
> "out-of-the-box" and occasionally ping the @user list for questions and
> always seem to get pointed in the right direction so that's been a huge
> help for us and I can't thank the community enough.
>
> I've spent the past several months documenting what we've gone through to
> get our recommendation pipeline up and running but it comes from a more
> "business-y" angle.

Publish it and post a link to the user list.

> ...
>
> It took me a while to piece together everything we could do with
> recommendations from various blogs, MIA, the Mahout wiki, slideshare, my
> own experimentation, etc. and I kept hoping I'd find a consolidated source
> for "this is how you can start to apply Mahout's recommendation
> capabilities in a fairly traditional commercial organization" but never
> stumbled across a resource that fit the bill. MIA was great in getting me
> to think about what might be possible but finding real-world examples was
> tough.

This is what you normally pay consulting companies for. Nobody will tell you how to use a recommender to become rich :-)

I wrote my master's thesis about a similar topic:

An architecture for evaluating recommender systems in real world scenarios
http://www.slideshare.net/ManuelB86/an-architecture-for-evaluating-recommender-systems-in-real-world-scenarios

If you are interested in the full thesis (97 pages), you can send me a private email.

>
> Personally, I would've loved to have something that gave me a little more
> in the "applied techniques" space that expanded on the basics of getting
> me in the right CLI syntax to kick off a recommendation job.

I don't think that this will happen.
Developing these strategies is often considered a business secret and a competitive advantage, and I would say it is.

> My experience
> was/is that probably 30% of my time was spent learning how to get Mahout
> deployed/integrated with our Hadoop cluster, kickoff the relevant jobs, get
> the results into a useable format, etc. and 70% thinking about how best to
> bring this capability into the fold for driving business results.

This is the same for most of the people here. It is not about the algorithms themselves. It is about finding which algorithm makes a difference. Most of the time you will find no significant change if you tune your current algorithm. The tiny little bits that do make a difference are very valuable knowledge.

There are people paying thousands of dollars for optimization:
http://contest.ipinyou.com
http://overstockreclabprize.com/
http://contest.plista.com/
http://www.kaggle.com/

So if you are willing to pay, I would guess there will be some people who will write documentation for you and make well-educated suggestions, e.g. me ;-)

> So, would
> it prove worthwhile to have something like this as a component of the
> Mahout wiki?

The wiki has migrated to Apache CMS. If you want to integrate some content, open a JIRA issue and attach your content in any format to that issue. Assign it to Isabel. She took responsibility for the new CMS.

> If so, I'm happy to condense some of my documentation and send
> to the group for thoughts and potential inclusion. I'm not the guy you want
> helping fix bugs or cleaning up code but I spend my days with "boots on the
> ground" implementation of Mahout in a Fortune 15 organization and I have to
> believe there are others out there in my position looking for help on how
> to integrate Mahout into their operations.
>
> Best,
>
> Nick

Hope that helps
Manuel

--
Manuel Blechschmidt
M.Sc. IT Systems Engineering
Dortustr. 57
14467 Potsdam
Mobile: 0173/6322621
Twitter: http://twitter.com/Manuel_B
