Good question. One straightforward approach is to compute all recommendations offline, in batch, publish them to some location, and then simply read them as needed. Yes, your front-end would need to access HDFS if the data were on HDFS. The downside is that you can't update in real time, and you spend CPU computing recs for users who may never need them.
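To make the batch option concrete, here is a rough sketch of the read side only. It assumes the batch job writes text lines of the form userID<TAB>[itemID:score,itemID:score,...] and that the output has already been copied somewhere the front-end can read it. The function names (parse_line, load_recs, recommend) are made up for illustration and are not part of any library.

```python
from typing import Dict, Iterable, List, Tuple

# user ID -> list of (item ID, score), in the order the batch job wrote them
Recs = Dict[int, List[Tuple[int, float]]]


def parse_line(line: str) -> Tuple[int, List[Tuple[int, float]]]:
    """Parse one line of the ASSUMED form: userID<TAB>[itemID:score,...]."""
    user, _, rest = line.strip().partition("\t")
    body = rest.strip().lstrip("[").rstrip("]")
    items: List[Tuple[int, float]] = []
    if body:
        for pair in body.split(","):
            item, _, score = pair.partition(":")
            items.append((int(item), float(score)))
    return int(user), items


def load_recs(lines: Iterable[str]) -> Recs:
    """Load precomputed recommendations into memory, skipping blank lines."""
    recs: Recs = {}
    for line in lines:
        if line.strip():
            user, items = parse_line(line)
            recs[user] = items
    return recs


def recommend(recs: Recs, user_id: int, how_many: int = 10) -> List[Tuple[int, float]]:
    """Return up to how_many precomputed recs; unknown users get an empty list."""
    return recs.get(user_id, [])[:how_many]


if __name__ == "__main__":
    # Hypothetical sample data standing in for the batch job's output file.
    sample = ["1\t[101:4.5,102:3.0]", "2\t[103:2.5]"]
    table = load_recs(sample)
    print(recommend(table, 1, how_many=1))
```

A web service would wrap recommend() in a request handler and reload the table whenever a new batch finishes. An in-memory dict only works while the output fits in RAM; past that point you would load the same data into a key-value store and look it up by user ID instead.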
The online implementations you've been playing with don't have those two problems, but they run into scale issues at some point. Still, I think one of these two approaches is probably 'just fine' for 80% of use cases. If not, the 'real' answer is a hybrid solution: use Hadoop to do periodic model recomputation offline, and use front-ends to do (at least approximate) real-time updates and computation. This sort of system is what I'm trying to build with Myrrix (myrrix.com), which you may be interested in if you have this kind of problem.

On Fri, Aug 3, 2012 at 6:16 PM, Matt Mitchell <[email protected]> wrote:
> Thanks Sean, that makes sense. I'll look into the source and see if I
> can learn more.
>
> Another question. I understand how the recommendations are created.
> I'd like to wrap this all up as a web service, but I'm not sure I
> understand how one would go about doing that. How would an app fetch
> recommendations for a user? Does my app need access to the HDFS file
> system?
>
> Thanks again.
