Hello.

I just started working on collaborative filtering (CF) in MLlib.
I am using trainImplicit because I only have implicit ratings such as
page views.

I am wondering which format is more appropriate for the ratings.
Let's assume that the view count is used as the rating, and that
user 1 views page 1 three times, page 2 twice, and so on.

In this case, I think the ratings can be formatted in either of the
following two ways (in practice the data is an RDD, of course).

A:
user_id,page_id,rating(page view)
1,1,0.3
1,2,0.2
...

B:
user_id,page_id,rating(page view)
1,1,0.1
1,1,0.1
1,1,0.1
1,2,0.1
1,2,0.1
...

Is it allowed to have duplicate (user_id, page_id) rows as in B?
If so, which is better? (Is there any difference between them?)
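
To make the relation between the two formats concrete, here is a minimal
sketch in plain Python (not Spark) of how B could be collapsed into A by
summing the weights per (user_id, page_id). The names rows_b and collapse
are only for illustration, and I am assuming that summing is the right way
to combine duplicates; I do not know whether trainImplicit does something
equivalent internally.

```python
from collections import defaultdict

# Format B from above: one row per page view, each with weight 0.1.
rows_b = [
    (1, 1, 0.1),
    (1, 1, 0.1),
    (1, 1, 0.1),
    (1, 2, 0.1),
    (1, 2, 0.1),
]

def collapse(rows):
    """Sum the weights of rows that share the same (user_id, page_id)."""
    totals = defaultdict(float)
    for user_id, page_id, weight in rows:
        totals[(user_id, page_id)] += weight
    # Round to hide floating-point noise such as 0.30000000000000004.
    return sorted((u, p, round(w, 6)) for (u, p), w in totals.items())

rows_a = collapse(rows_b)
print(rows_a)  # [(1, 1, 0.3), (1, 2, 0.2)], i.e. format A
```

On an actual RDD, I assume the same thing could be done by mapping each row
to ((user_id, page_id), rating) and calling reduceByKey with addition before
building the Rating objects.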

Best,
Hiro
