No, the test data can't be included in the training data, or else it would be like giving a student the answers to the exam beforehand.
You're doing much less work for other reasons. Recommendation is a bigger problem: producing one set of recommendations may require computing estimated preferences for many candidate items, whereas evaluation just computes a single item preference each time. There is also an additional parameter, "evaluation percentage". Are you setting this to less than 1? That simply throws out some percentage of all the data entirely. It is a way to make the evaluation quicker (and less accurate) by shrinking the problem.
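To make the idea concrete, here is a rough sketch in Python of what a hold-out evaluation with an evaluation percentage might look like. It is not the library's actual code: the function names `split_for_evaluation` and `mean_absolute_error`, the `(user, item, rating)` tuple layout, and the item-mean estimator are all assumptions made up for illustration.

```python
import random
from collections import defaultdict

def split_for_evaluation(prefs, training_pct, evaluation_pct, seed=0):
    """prefs is a list of (user, item, rating) tuples (hypothetical layout)."""
    rng = random.Random(seed)
    # Shrink the problem: keep only roughly evaluation_pct of the data.
    kept = [p for p in prefs if rng.random() < evaluation_pct]
    train, test = [], []
    for p in kept:
        # Each kept preference goes to exactly one side, so the test
        # data never leaks into the training data.
        (train if rng.random() < training_pct else test).append(p)
    return train, test

def mean_absolute_error(train, test):
    # Stand-in estimator (an assumption): predict an item's mean training
    # rating, falling back to the global mean for unseen items.
    sums, counts = defaultdict(float), defaultdict(int)
    for _, item, rating in train:
        sums[item] += rating
        counts[item] += 1
    global_mean = sum(r for _, _, r in train) / len(train)
    total = 0.0
    for _, item, actual in test:
        # One estimate per held-out preference; no ranking of candidates.
        estimate = sums[item] / counts[item] if counts[item] else global_mean
        total += abs(estimate - actual)
    return total / len(test)
```

Calling `split_for_evaluation(prefs, 0.7, 0.5)` would discard about half of the preferences before splitting the rest 70/30, so the evaluation loop has roughly half as many estimates to compute and the resulting score is noisier.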
