This is expected behavior as far as I understand the algorithm. I
don't see how a user-based recommender can estimate a preference by X
for Y if nobody who rated Y is connected to X at all.
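
To make that concrete, here is a toy sketch of a similarity-weighted
estimate over a fixed neighborhood. It is not Mahout's code: the class
name, method name and data below are made up for illustration. The
point is just that when none of the n neighbors has rated Y there is
nothing to average, so NaN is the only honest answer.

```java
import java.util.List;
import java.util.Map;

// Toy illustration only; none of these names come from Mahout.
class NeighborhoodCoverageSketch {

    // Similarity-weighted average of the neighbors' ratings for one item.
    // Returns NaN when no neighbor has rated the item.
    static double estimate(List<String> neighbors,
                           Map<String, Double> similarityToTarget,
                           Map<String, Map<String, Double>> ratings,
                           String item) {
        double weightedSum = 0.0;
        double totalWeight = 0.0;
        for (String neighbor : neighbors) {
            Map<String, Double> neighborRatings = ratings.get(neighbor);
            if (neighborRatings == null) {
                continue; // neighbor has no ratings at all
            }
            Double rating = neighborRatings.get(item);
            if (rating == null) {
                continue; // neighbor never rated this item
            }
            double sim = similarityToTarget.getOrDefault(neighbor, 0.0);
            weightedSum += sim * rating;
            totalWeight += sim;
        }
        return totalWeight == 0.0 ? Double.NaN : weightedSum / totalWeight;
    }

    public static void main(String[] args) {
        // Similarity of each user to the target user X (hypothetical values).
        Map<String, Double> sim = Map.of("A", 0.9, "B", 0.8, "C", 0.5);
        // Only C, the least similar user, has rated item Y.
        Map<String, Map<String, Double>> ratings = Map.of(
                "A", Map.of("W", 5.0),
                "B", Map.of("W", 3.0),
                "C", Map.of("Y", 4.0));

        // n = 2: neither A nor B rated Y, so there is no estimate.
        System.out.println(estimate(List.of("A", "B"), sim, ratings, "Y"));      // NaN
        // n = 3: C is now in the neighborhood and supplies a rating.
        System.out.println(estimate(List.of("A", "B", "C"), sim, ratings, "Y")); // 4.0
    }
}
```

A larger n, or a neighborhood chosen with the target item in mind,
trades that loss of coverage against pulling in less similar users.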

You can use a PreferenceInferrer to fill in a lot of missing data, but
I don't really recommend this for more than experimentation.
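
For a feel of what inference means here, the simplest strategy is to
stand in the user's own average rating for anything they have not
rated. The sketch below is a toy version of that idea, with a made-up
class and method name; it is not the PreferenceInferrer interface
itself, and I am only assuming the averaging strategy is
representative.

```java
import java.util.Map;

// Toy illustration only; not Mahout's PreferenceInferrer API.
class AveragingInferenceSketch {

    // Inferred preference for any item the user has not rated:
    // the mean of the ratings the user did give, or NaN if there are none.
    static double inferPreference(Map<String, Double> userRatings) {
        if (userRatings.isEmpty()) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double rating : userRatings.values()) {
            sum += rating;
        }
        return sum / userRatings.size();
    }

    public static void main(String[] args) {
        // Hypothetical user who rated two items.
        System.out.println(inferPreference(Map.of("W", 2.0, "Z", 4.0))); // 3.0
    }
}
```

Every filled-in value is the same constant per user, so it adds
coverage without adding information, which is why I would keep it to
experiments.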

The issue here is mostly that the user-item matrix is too sparse. And
yes, there are loads of follow-up suggestions that tackle that,
depending on your data, as Alex hinted at.

On Mon, Aug 9, 2010 at 3:31 AM, Yanir Seroussi <[email protected]> wrote:
> Hi,
>
> The first example here (
> https://cwiki.apache.org/confluence/display/MAHOUT/Recommender+Documentation)
> shows how to create a GenericUserBasedRecommender with a
> NearestNUserNeighborhood. My problem/question is that setting n to any small
> number seems to limit the coverage of the recommender, because the nearest n
> users are calculated without taking the target item into account.
> For example, given a user X and n = 10, if we want to estimatePreference()
> for an item Y, if this item is not rated by any user in the neighbourhood,
> the prediction will be NaN. I don't think that this is what one would expect
> from a user-based nearest-neighbour recommender, as Herlocker et al. (1999),
> who are cited in the example page above, didn't mention any change in
> coverage based on the number of nearest neighbours.
> Am I doing something wrong, or is this the way it should be? I have a
> feeling it is not the way it should be, because then using small
> neighbourhood sizes makes no sense as it severely restricts the ability of
> the recommender to estimate preferences.
>
> Please note that I observed this behaviour in version 0.3, but it seems to
> be the same in the latest version.
>
> Cheers,
> Yanir
>
