Hi,

We are developing a system that issues recommendations in real time, based on a main data file (say, /tmp/data.lst) together with daily update files (/tmp/data.1.lst, /tmp/data.2.lst, etc.). We call refresh() on the SlopeOne recommender whenever the daily files are updated. We are concerned about performance while the daily update files are being loaded, and would appreciate any feedback on what to expect.
I've been looking through the Mahout code to determine whether Mahout can make recommendations while the (SlopeOne) recommender is being refreshed. From what I can tell, the call to refresh() ends up in MemoryDiffStorage.buildAverageDiffs(), which acquires a write lock. This would stall any calls to MemoryDiffStorage.getDiffs(), which acquires a read lock. So it looks to me like MemoryDiffStorage takes a locking-based approach rather than a fill-and-swap approach.

On the other hand, FileDataModel has a reload() method containing:

    delegate = buildModel()

This looks like a fill-and-swap approach, which would allow the system to keep serving recommendations seamlessly even while the model is being refreshed.

Is this correct? If so, should we be concerned about the locking in MemoryDiffStorage? Are there any workarounds?

Thanks in advance!

Regards,
Eric
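
P.S. To make the distinction concrete, here is a minimal sketch of the two approaches in plain Java. The class names (LockingDiffStore, SwappingDiffStore) and the simple map of doubles are made up for illustration; this is not Mahout's actual code, only my understanding of the general pattern.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; these classes are NOT part of Mahout.

/** Locking approach: readers wait while a rebuild holds the write lock. */
class LockingDiffStore {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, Double> diffs = new HashMap<>();

    void rebuild(Map<String, Double> fresh) {
        lock.writeLock().lock();          // blocks all readers until unlock
        try {
            diffs.clear();
            diffs.putAll(fresh);          // a slow load here stalls getDiff()
        } finally {
            lock.writeLock().unlock();
        }
    }

    Double getDiff(String key) {
        lock.readLock().lock();           // waits if a rebuild is in progress
        try {
            return diffs.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }
}

/** Fill-and-swap approach: build the new map aside, then publish it at once. */
class SwappingDiffStore {
    // volatile so readers always see a fully built map after the swap
    private volatile Map<String, Double> diffs = new HashMap<>();

    void rebuild(Map<String, Double> fresh) {
        Map<String, Double> next = new HashMap<>(fresh); // fill (readers unaffected)
        diffs = next;                                    // swap (single reference write)
    }

    Double getDiff(String key) {
        return diffs.get(key);            // never blocks; reads old or new map
    }
}
```

With the second variant, readers keep using the old map until the new one is complete, at the cost of holding both copies in memory during the rebuild.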
