I believe the ALS algorithm expects the ratings to be aggregated, with one
row per user and page, as in format A. I don't see why you would need to
use decimals for the ratings.
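
For what it's worth, here is a minimal sketch of collapsing format B into
format A. It is plain Python rather than Spark so that it runs on its own;
the helper name aggregate_views and the sample rows are made up for
illustration, and it assumes each page view is recorded as a raw count of 1
rather than a decimal. On an RDD, the equivalent would presumably be a
reduceByKey over (user_id, page_id) keys before calling trainImplicit.

```python
from collections import defaultdict


def aggregate_views(rows):
    """Sum repeated (user_id, page_id) rows into one rating per pair."""
    totals = defaultdict(float)
    for user_id, page_id, rating in rows:
        totals[(user_id, page_id)] += rating
    # Sort so the output order is deterministic.
    return sorted((u, p, r) for (u, p), r in totals.items())


# Format B (hypothetical sample): one row per page view, each counted as 1.
format_b = [(1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 2, 1), (1, 2, 1)]

# Format A: one row per (user, page) with the summed view count.
format_a = aggregate_views(format_b)
print(format_a)  # [(1, 1, 3.0), (1, 2, 2.0)]
```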

Regards
Sab

On Thu, Feb 25, 2016 at 4:50 PM, Hiroyuki Yamada <mogwa...@gmail.com> wrote:

> Hello.
>
> I just started working on CF in MLlib.
> I am using trainImplicit because I only have implicit ratings like page
> views.
>
> I am wondering which form of ratings is more appropriate.
> Let's assume that the view count is regarded as a rating, and that
> user 1 views page 1 three times, page 2 twice, and so on.
>
> In this case, I think the ratings can be formatted in one of the following
> two ways (in practice, of course, they would be in an RDD).
>
> A:
> user_id,page_id,rating(page view)
> 1,1,0.3
> 1,2,0.2
> ...
>
> B:
> user_id,page_id,rating(page view)
> 1,1,0.1
> 1,1,0.1
> 1,1,0.1
> 1,2,0.1
> 1,2,0.1
> ...
>
> Is a format like B allowed?
> If it is, which is better? (Is there any difference between them?)
>
> Best,
> Hiro
>
>
>
>
