On Sun, Oct 12, 2008 at 8:47 AM, Ian Holsman <[EMAIL PROTECTED]> wrote:
> right.. we have put our 'real time' portion on the side lines for the
> moment, and have hadoop jobs running every X minutes to process the data
> coming in.

Incidentally, this sort of model is certainly what I recommend. I don't think real-time updates to recommenders are a good use of resources, and in many cases they aren't even feasible.

> we put the log files onto HDFS so that other things can read them and
> process them.

PS: if you have suggestions for improvements to the code here, such as the ability to read from N files instead of 1, or from N tables, do let me know so I can get on it.
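
To make the "N files instead of 1" idea concrete, here is a rough sketch of what it could look like. It is only an illustration in Python, not the project's code: the function name `read_preferences` and the comma-separated `user,item,value` line format are assumptions, as is the convention that lines starting with `#` are comments.

```python
import csv
from typing import Iterable, Iterator, Tuple


def read_preferences(paths: Iterable[str]) -> Iterator[Tuple[str, str, float]]:
    """Yield (user, item, value) tuples from each file in turn.

    The caller sees one continuous stream, as if the N files were one.
    Assumed (hypothetical) line format: user,item,value
    """
    for path in paths:
        with open(path, newline="") as handle:
            for row in csv.reader(handle):
                # Skip blank lines and comment lines (assumed to start with '#').
                if not row or row[0].startswith("#"):
                    continue
                yield row[0], row[1], float(row[2])
```

A periodic batch job could then pass in whatever log files have arrived since the last run, e.g. `read_preferences(["part-0.csv", "part-1.csv"])`, where the file names are placeholders.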
