I believe that's right, and it is what I was getting at: yes, the implicit
formulation ends up implicitly including every possible interaction in
its loss function, even unobserved ones. That could be the difference.

This is mostly an academic question though. In practice, you have
click-like data and should be using the implicit version for sure.

However, you can give negative implicit feedback to the model. You
could treat no-click as a mild, observed, negative interaction;
that is, supply a small negative value for these cases. Unobserved
pairs are simply not part of the data set. I'd still be careful about
assuming the lack of an action carries signal.
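
For illustration only, here is a rough sketch of that encoding with the MLlib
RDD API (the clicks / noClicks RDDs and the hyperparameter values are
placeholders, not anything from your data):

  import org.apache.spark.mllib.recommendation.{ALS, Rating}

  // Displayed-and-clicked pairs become positive preferences.
  val positive = clicks.map { case (user, item) => Rating(user, item, 1.0) }

  // Displayed-but-not-clicked pairs get a small negative value: the preference
  // is still treated as 0, and keeping the magnitude small keeps the derived
  // confidence close to 1.
  val negative = noClicks.map { case (user, item) => Rating(user, item, -0.1) }

  // Pairs that were never displayed are simply left out of the input.
  val training = positive.union(negative)

  // rank = 30, iterations = 10, lambda = 0.01, alpha = 1.0 -- all illustrative.
  val model = ALS.trainImplicit(training, 30, 10, 0.01, 1.0)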

On Thu, Feb 26, 2015 at 3:07 PM, 163 <lisend...@163.com> wrote:
> Oh my god, I think I understand now...
> In my case, there are three kinds of user-item pairs:
>
> Display and click pair (positive pair)
> Display but no-click pair (negative pair)
> No-display pair (unobserved pair)
>
> Explicit ALS only considers the first and second kinds,
> but implicit ALS considers all three kinds of pairs (and treats the third
> kind the same as the second, because their preference values are all zero
> and their confidence values are all 1).
>
> So the results are different, right?
>
> Could you please give me some advice on which ALS I should use?
> If I use the implicit ALS, how do I distinguish the second and the third
> kinds of pairs? :)
>
> My opinion is that, in my case, I should use explicit ALS ...
>
> Thank you so much
>
> On Feb 26, 2015, at 22:41, Xiangrui Meng <m...@databricks.com> wrote:
>
> Lisen, did you use all m-by-n pairs during training? The implicit model
> penalizes unobserved ratings, while the explicit model doesn't. -Xiangrui
>
> On Feb 26, 2015 6:26 AM, "Sean Owen" <so...@cloudera.com> wrote:
>>
>> +user
>>
>> On Thu, Feb 26, 2015 at 2:26 PM, Sean Owen <so...@cloudera.com> wrote:
>>>
>>> I think I may have it backwards, and that you are correct to keep the 0
>>> elements in train() in order to try to reproduce the same result.
>>>
>>> The second formulation is called 'weighted regularization' and is used
>>> for both implicit and explicit feedback, as far as I can see in the code.
>>>
>>> Hm, I'm actually not clear why these would produce different results.
>>> Different code paths are used, to be sure, but I'm not yet certain why
>>> they would give different results.
>>>
>>> In general you wouldn't use train() for data like this though, and would
>>> never set alpha=0.
>>>
>>> On Thu, Feb 26, 2015 at 2:15 PM, lisendong <lisend...@163.com> wrote:
>>>>
>>>> I want to confirm the loss function you use (sorry, I'm not so familiar
>>>> with Scala, so I could not follow the MLlib source code).
>>>>
>>>> According to the papers:
>>>>
>>>>
>>>> In your implicit-feedback ALS, the loss function is (ICDM 2008):
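>>>>
>>>> (written out in text form, from that paper:)
>>>>
>>>>     min over X, Y of   sum_{u,i} c_ui * (p_ui - x_u^T y_i)^2
>>>>                        + lambda * ( sum_u ||x_u||^2 + sum_i ||y_i||^2 )
>>>>
>>>>     where c_ui = 1 + alpha * r_ui and p_ui = 1 if r_ui > 0, else 0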
>>>>
>>>> In the explicit-feedback ALS, the loss function is (Netflix 2008):
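>>>>
>>>> (written out in text form, from that paper, which uses weighted-lambda
>>>> regularization:)
>>>>
>>>>     min over X, Y of   sum over observed (u,i) of (r_ui - x_u^T y_i)^2
>>>>                        + lambda * ( sum_u n_u * ||x_u||^2 + sum_i n_i * ||y_i||^2 )
>>>>
>>>>     where n_u / n_i are the number of ratings by user u / of item i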
>>>>
>>>> Note that besides the difference in the confidence parameter c_ui, the
>>>> regularization is also different. Does your code also have this difference?
>>>>
>>>> Best Regards,
>>>> Sendong Li
>>>>
>>>>
>>>>> On Feb 26, 2015, at 9:42 PM, lisendong <lisend...@163.com> wrote:
>>>>>
>>>>> Hi meng, fotero, sowen:
>>>>>
>>>>> I'm using ALS with Spark 1.0.0; the code should be:
>>>>>
>>>>> https://github.com/apache/spark/blob/branch-1.0/mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala
>>>>>
>>>>> I think the following two methods should produce the same (or nearly the
>>>>> same) result:
>>>>>
>>>>> // explicit: rank = 30, iterations = 30, lambda = 0.01, blocks = -1, seed = 1
>>>>> MatrixFactorizationModel model = ALS.train(ratings.rdd(), 30, 30, 0.01, -1, 1);
>>>>>
>>>>> // implicit: rank = 30, iterations = 30, lambda = 0.01, blocks = -1, alpha = 0, seed = 1
>>>>> MatrixFactorizationModel model = ALS.trainImplicit(ratings.rdd(), 30, 30, 0.01, -1, 0, 1);
>>>>>
>>>>> The data I used is a display log; the format is as follows:
>>>>>
>>>>> user  item  if-click
>>>>>
>>>>> I use 1.0 as the score for click pairs, and 0 as the score for non-click pairs.
>>>>>
>>>>> In the second method, alpha is set to zero, so the confidence for
>>>>> positive and negative pairs is 1.0 in both cases (right?)
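>>>>>
>>>>> (With the ICDM 2008 definition c_ui = 1 + alpha * r_ui, alpha = 0 would
>>>>> give c_ui = 1 for every observed pair.)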
>>>>>
>>>>> I think the two methods should produce similar results, but they don't:
>>>>> the second method's result is very bad (the AUC of the first is 0.7, but
>>>>> the AUC of the second is only 0.61).
>>>>>
>>>>>
>>>>> I do not understand why. Could you help me?
>>>>>
>>>>>
>>>>> Thank you very much!
>>>>>
>>>>> Best Regards,
>>>>> Sendong Li
>>>>
>>>>
>>>
>>
