Hi, thanks for the response.

1. Where can I read about how LLR works in Mahout? I'm not a math person, so the Java code gives me no intuition.
2. What is "down-sampling" in the context of this problem? The translation I found is roughly "reduce the size of the group you choose". I do not change anything. It worked pretty well for a week, then started to show strange items at the top.
3. Transmission error: I can't find a place where this could occur. The guys don't keep history for the item dictionary, so I keep a snapshot of it for each calculation day. I've checked the suspicious item ids: no changes at all.
4. "The counts are somehow very wrong. Conceivable but unlikely." I've lost the context. The counts of what?
5. "Somehow encouraged users to combine these seemingly unrelated items": checking the cooccurrence...

2014-09-03 12:06 GMT+04:00 Ted Dunning <[email protected]>:

> On Wed, Sep 3, 2014 at 12:43 AM, Serega Sheypak <[email protected]>
> wrote:
>
> > What is the right way to debug such situations? What can I read? There
> > were no changes to any system.
>
> First step is to dump the cooccurrence counts for the items in question.
>
> The process from there is to find out what the problem is. It can be:
>
> 1) the LLR calculation went wrong (unlikely given the high usage of that
> code)
>
> 2) the counts are somehow very wrong. Conceivable but unlikely.
>
> 3) the down-sampling causes some strange pathology. You should record
> counts before and after downsampling. This is very unlikely.
>
> 4) the data has some format or other transmission error. Moderately
> likely.
>
> 5) the system has somehow encouraged users to combine these seemingly
> unrelated items. This is most likely in my experience.
>
> If the case is (5), you can't fix it with math or code. The best answer is
> to simply have an exception list that says when you have *this* item, you
> must not show *that* item as a recommendation. Essentially, this is an
> edit on the indicator list.
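
For intuition on the LLR score mentioned in point 1, here is a minimal Python sketch of the standard log-likelihood ratio computed from four cooccurrence counts. It is an illustration under assumptions, not Mahout's actual code: the function names and the example counts are made up, and k11, k12, k21, k22 are assumed to mean "users with both items", "item A without B", "item B without A", and "neither item". The score is near zero when the two items appear together about as often as chance predicts, and grows as the cooccurrence becomes more surprising.

```python
import math


def x_log_x(x):
    # x * ln(x), with the convention that 0 * ln(0) = 0
    return x * math.log(x) if x > 0 else 0.0


def entropy(*counts):
    # Unnormalized entropy of a set of counts
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    # k11: both items, k12: A without B, k21: B without A, k22: neither
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point round-off
    return max(0.0, 2.0 * (row + col - mat))


if __name__ == "__main__":
    # Items that cooccur exactly as often as chance predicts: score near 0
    print(llr(10, 10, 10, 10))
    # Items that always appear together: clearly positive score
    print(llr(10, 0, 0, 10))
```

Dumping the four counts for a suspicious item pair and feeding them to a function like this makes it easy to see whether a strange recommendation comes from the counts themselves or from a later step.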

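
The exception list suggested for case (5) can be sketched as a simple post-filter on the indicator list. This is only an illustration: the data layout (a dict from item id to an ordered list of recommended item ids) and the names `apply_exceptions`, `indicators`, and `exceptions` are assumptions, not anything Mahout provides.

```python
def apply_exceptions(indicators, exceptions):
    """Drop banned recommendations from each item's indicator list.

    indicators: dict mapping item id -> ordered list of recommended item ids
    exceptions: dict mapping item id -> set of item ids that must not be shown
    """
    filtered = {}
    for item, recs in indicators.items():
        banned = exceptions.get(item, set())
        # Keep the original ranking order, minus the banned items
        filtered[item] = [r for r in recs if r not in banned]
    return filtered


if __name__ == "__main__":
    indicators = {"a": ["b", "c"], "d": ["a"]}
    exceptions = {"a": {"c"}}  # when you have "a", never show "c"
    print(apply_exceptions(indicators, exceptions))
```

Keeping the exceptions in a separate structure means the list survives each recalculation and can be reapplied to every new day's indicators.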