In general, if you want real-time recommendations, you want the data in memory; otherwise it's too slow. The JDBC-backed model works for small problems, roughly up to a couple million ratings. Beyond that, load the data into memory. (And past about 100M ratings, you need to consider distributing the computation.)
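As a rough illustration of that rule of thumb, here is a minimal sketch in plain Java. The class, enum, and method names are hypothetical and not part of Mahout, and the exact cutoffs (2,000,000 for "a couple million", 100,000,000 for "about 100M") are my assumptions; treat them as order-of-magnitude guides, not hard limits.

```java
// Hypothetical helper, not a Mahout class: it only encodes the sizing rule of thumb.
class DataModelSizing {

    enum Strategy { JDBC_FEASIBLE, IN_MEMORY, DISTRIBUTED }

    // Assumed cutoffs; the advice itself only says "a couple million" and "about 100M".
    static final long JDBC_LIMIT = 2_000_000L;
    static final long IN_MEMORY_LIMIT = 100_000_000L;

    static Strategy suggest(long ratingCount) {
        if (ratingCount < 0) {
            throw new IllegalArgumentException("ratingCount must be non-negative");
        }
        if (ratingCount <= JDBC_LIMIT) {
            // A JDBC-backed model is workable here, though in-memory is still faster.
            return Strategy.JDBC_FEASIBLE;
        }
        if (ratingCount <= IN_MEMORY_LIMIT) {
            return Strategy.IN_MEMORY;
        }
        return Strategy.DISTRIBUTED;
    }

    public static void main(String[] args) {
        long[] samples = {500_000L, 20_000_000L, 250_000_000L};
        for (long n : samples) {
            System.out.println(n + " ratings -> " + suggest(n));
        }
    }
}
```

Even in the smallest bucket, an in-memory model remains the better choice if you need real-time responses; the first bucket only marks where a JDBC-backed model is still usable.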
FileDataModel is also in-memory, since it uses GenericDataModel inside. It's built for fine-grained updates with its "delta files" support.

On Wed, Oct 6, 2010 at 3:57 PM, James James <[email protected]> wrote:
> Hi,
>
> I was evaluating which DataModel should be used when we are dealing with a
> large amount of user data with new data coming in on a regular basis (for
> example on a daily basis). The GenericModel is immutable, which requires the
> user data to be reloaded when new data comes in. I have not tried
> JDBCDataModel yet. Based on the posts here, it seems to me the reloading is
> not needed for JDBCDataModel since it is always kept up-to-date.
>
> Do you think that JDBCDataModel is more efficient for my case? Are there any
> implementations of DataModel using HBase?
>
> Thanks,
>
> James
