After you tune your collaborative-filtering (CF) model with training/test splits, you then try it against your known-good data. The known-good data is a second kind of test data.
You assume that your input data and your "gold standard" data have the same statistical profile. If the performance against the sampled test data and the known-good test data differs, then you might be comparing two different kinds of data.

On Tue, Apr 3, 2012 at 9:07 PM, ziad kamel <[email protected]> wrote:
> Hi! I understand the reason behind splitting data into training and
> test sets during classification and clustering, but why do we need to
> do that during CF? We just select the top X from the list and compare
> it with good recommendations.
> Thanks

--
Lance Norskog
[email protected]
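The comparison described above can be sketched roughly as follows. This is a minimal illustration, not anyone's actual pipeline: it uses synthetic ratings and a trivial item-mean predictor standing in for a tuned recommender, and the `gold` set is a hypothetical stand-in for the known-good data. The point is only the evaluation structure: tune on a train/test split, then score the same model against a second, independently sourced test set and compare the two error numbers.

```python
import random

random.seed(0)

# Synthetic (user, item, rating) triples standing in for the input data.
ratings = [(u, i, random.choice([1, 2, 3, 4, 5]))
           for u in range(50) for i in range(20) if random.random() < 0.3]

# 1) Split the input data into training and test sets for tuning.
random.shuffle(ratings)
cut = int(0.8 * len(ratings))
train, test = ratings[:cut], ratings[cut:]

# 2) "Tune" a trivial recommender: predict each item's mean training rating.
item_sums, item_counts = {}, {}
for _, item, r in train:
    item_sums[item] = item_sums.get(item, 0) + r
    item_counts[item] = item_counts.get(item, 0) + 1
global_mean = sum(r for _, _, r in train) / len(train)

def predict(item):
    # Fall back to the global training mean for unseen items.
    if item in item_counts:
        return item_sums[item] / item_counts[item]
    return global_mean

def rmse(triples):
    # Root-mean-square error of the predictions over (user, item, rating).
    se = sum((predict(i) - r) ** 2 for _, i, r in triples)
    return (se / len(triples)) ** 0.5

# 3) Hypothetical "known good" / gold-standard data from a second source.
gold = [(u, i, random.choice([1, 2, 3, 4, 5]))
        for u in range(5) for i in range(20)]

# 4) If these two numbers diverge sharply, the gold data may have a
#    different statistical profile than the sampled input data.
print("sampled test RMSE:", rmse(test))
print("gold test RMSE:   ", rmse(gold))
```

Here both sets are drawn from the same synthetic distribution, so the two RMSEs should be close; with real data, a large gap is the signal that you are comparing two different kinds of data.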
