On Wed, Sep 3, 2014 at 12:43 AM, Serega Sheypak <[email protected]> wrote:
> What are the right way to debug such situations? What can i read? There
> were no changes to any system.
>

The first step is to dump the cooccurrence counts for the items in question. From there, the process is to narrow down what the problem is. It can be:

1) the LLR calculation went wrong (unlikely given the high usage of that code)

2) the counts are somehow very wrong. Conceivable but unlikely.

3) the down-sampling causes some strange pathology. You should record counts before and after down-sampling. This is very unlikely.

4) the data has some format or other transmission error. Moderately likely.

5) the system has somehow encouraged users to combine these seemingly unrelated items. This is most likely in my experience.

If the case is (5), you can't fix it with math or code. The best answer is to simply have an exception list that says when you have *this* item, you must not show *that* item as a recommendation. Essentially, this is an edit on the indicator list.

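For case (5), the exception list can be applied as a filter over the indicator list just before recommendations are served. This is a minimal sketch assuming the indicators are available as a mapping from an item to its ordered list of recommended items; the data layout, the function name, and the item IDs are all hypothetical.

```python
def apply_exceptions(indicators, exceptions):
    """Drop banned (this_item, that_item) pairs from an indicator list.

    indicators: dict mapping an item ID to its ordered list of indicator items
    exceptions: set of (this_item, that_item) pairs that must never be shown
    """
    return {
        item: [rec for rec in recs if (item, rec) not in exceptions]
        for item, recs in indicators.items()
    }


# Invented example: never show "baby-wipes" when the user has "motor-oil".
indicators = {
    "motor-oil": ["oil-filter", "baby-wipes", "funnel"],
    "baby-wipes": ["diapers", "motor-oil"],
}
exceptions = {("motor-oil", "baby-wipes")}

filtered = apply_exceptions(indicators, exceptions)
print(filtered)
```

The pairs are directional in this sketch, so banning one direction leaves the reverse recommendation in place unless it is listed too.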