Re: Negative values of predictions in ALS.tranform

2016-12-16 Thread Manish Tripathi
Thanks a bunch. That's very helpful.


Re: Negative values of predictions in ALS.tranform

2016-12-16 Thread Sean Owen
That all looks correct.


Re: Negative values of predictions in ALS.tranform

2016-12-15 Thread Manish Tripathi
OK, thanks. So here is what I understood.

1) The input data to ALS(implicitPrefs=True).fit() is the actual strengths
(count data). So if I have a matrix of (user, item, views/purchases), I pass
that as the input and not the binarized one (preference). This signifies the
strength.

2) Since we also pass the alpha parameter to ALS, Spark internally creates
the confidence matrix 1 + alpha*input_data, or something similar scaled by
alpha.

3) The output is basically a factorization of the 0/1 matrix (the binarized
version of the initial input data), hence the output also resembles the
preference matrix (0/1), suggesting the interaction. So typically it should
be between 0 and 1, but if it is negative it means very little
preference/interaction.

*Does all the above sound correct?*

If yes, then one last question:

*For an explicit dataset, where we don't use implicitPrefs=True,* the
predicted ratings would be actual ratings, e.g. 2.3 or 4.5, and not the
interaction measure. That is because in the explicit case we are not using
the confidence matrix and preference matrix concept and use the actual
rating data. So any output from Spark ALS for explicit data would be a
rating prediction.


Re: Negative values of predictions in ALS.tranform

2016-12-15 Thread Sean Owen
No, the inputs are weights or strengths. The output is a factorization of
the binarization of that input to 0/1, not probabilities and not a
factorization of the input itself. This explains the range of the output.



Re: Negative values of predictions in ALS.tranform

2016-12-15 Thread Manish Tripathi
When you say *implicit ALS is factoring the 0/1 matrix*, are you saying
that for the implicit feedback algorithm we need to pass the input data as
the preference matrix, i.e. a matrix of 0s and 1s?

Then how will it calculate the confidence matrix, which is basically
1 + alpha*count_matrix? If we don't pass the actual counts (views etc.),
how does Spark calculate the confidence matrix?

My understanding was that the input data for ALS(implicitPrefs=True).fit()
is the actual count matrix of views/purchases. Am I going wrong here? If
yes, then how is Spark calculating the confidence matrix if it doesn't have
the actual count data?

The original paper on which the Spark algorithm is based needs the actual
count data to create a confidence matrix and also needs the 0/1 matrix,
since the objective function uses both the confidence matrix and the 0/1
matrix to find the user and item factors.
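
The objective mentioned above can be written down directly. The sketch below follows the loss from the implicit-feedback paper (confidence-weighted squared error against the 0/1 preference, plus L2 regularization) in plain Python. The function name, the toy factors, and the parameter values are assumptions for illustration; Spark's implementation details (for example how regularization is scaled) may differ.

```python
def implicit_loss(counts, user_factors, item_factors, alpha, reg):
    """Confidence-weighted squared error against the 0/1 preference matrix."""
    loss = 0.0
    for u, row in enumerate(counts):
        for i, count in enumerate(row):
            p = 1.0 if count > 0 else 0.0   # target is the binarized value
            c = 1.0 + alpha * count         # the count only sets the weight
            pred = sum(a * b for a, b in zip(user_factors[u], item_factors[i]))
            loss += c * (p - pred) ** 2
    # L2 penalty on all factor entries
    loss += reg * sum(x * x for vec in user_factors + item_factors for x in vec)
    return loss


counts = [[2, 0]]  # one user, two items (invented data)
# Factors that reproduce the 0/1 preferences exactly
fit_binary = implicit_loss(counts, [[1.0]], [[1.0], [0.0]], alpha=1.0, reg=0.0)
# Factors that reproduce the raw count 2 instead
fit_counts = implicit_loss(counts, [[1.0]], [[2.0], [0.0]], alpha=1.0, reg=0.0)
print(fit_binary, fit_counts)
```

In this toy case, factors that reproduce the raw count are penalized, while factors that reproduce the binarized value are not. That is why predictions tend to sit near the 0 to 1 range rather than near the counts.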



Re: Negative values of predictions in ALS.tranform

2016-12-15 Thread Sean Owen
No, you can't interpret the output as probabilities at all. In particular
they may be negative. It is not predicting rating but interaction. Negative
means very strongly not predicted to interact. No, implicit ALS *is*
factoring the 0/1 matrix.
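
If the goal is to evaluate against binarized labels, a ranking metric such as AUC depends only on the ordering of the scores, so negative or above-1 values are not a problem in themselves. The sketch below computes AUC by pair counting in plain Python. The scores and labels are invented, and this illustrates the idea rather than how any Spark evaluator is implemented.

```python
def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented raw ALS-style scores, including values outside [0, 1].
scores = [1.2, 0.8, 0.3, -0.1, -0.4]
labels = [1, 1, 0, 1, 0]
shifted = [10 * s + 5 for s in scores]  # same ordering, different scale

print(auc(scores, labels))
print(auc(shifted, labels))
```

Both calls give the same value, because shifting and rescaling the scores does not change their order.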



Re: Negative values of predictions in ALS.tranform

2016-12-15 Thread Manish Tripathi
OK. So we can kind of interpret the output as probabilities even though it
is not modeling probabilities? This is so that I can use it with
BinaryClassificationEvaluator.

The way I understand it, as per the algorithm, the predicted matrix is
basically the product of the user factor matrix and the item factor matrix.

But in what circumstances can the predicted ratings be negative? I can
understand that if the individual user factor vector and item factor vector
have negative terms, then the dot product can be negative. But practically,
does a negative value make any sense? As per the algorithm, the dot product
is the predicted rating, so the rating shouldn't be negative for it to make
any sense. Also, is a rating between 0 and 1 a normalised rating? Typically
we expect a rating to be any real value, like 2.3 or 4.5.

Also please note that for implicit feedback ALS we don't feed a 0/1 matrix.
We feed the count matrix (discrete count values), and I am assuming Spark
internally converts it into a preference matrix (1/0) and a confidence
matrix = 1 + alpha*count_matrix.
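
As an illustration of the split described above, the following sketch derives a preference matrix and a confidence matrix from raw counts, following the formulation in the implicit-feedback paper the thread refers to (p = 1 if count > 0 else 0, c = 1 + alpha*count). It is plain Python on a toy matrix, not Spark's internal code. The counts are invented, and alpha = 40.0 simply matches the value used in this thread.

```python
ALPHA = 40.0  # same alpha as in the ALS(...) call quoted in this thread

# Toy (user x item) view counts; values are invented for illustration.
counts = [
    [3, 0, 1],
    [0, 0, 5],
]

# Preference: did the user interact with the item at all?
preference = [[1 if c > 0 else 0 for c in row] for row in counts]

# Confidence: how much weight each cell gets in the loss.
confidence = [[1.0 + ALPHA * c for c in row] for row in counts]

print(preference)
print(confidence)
```

Unobserved cells keep a confidence of 1.0, so they still count as weak evidence of "no interaction".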



Re: Negative values of predictions in ALS.tranform

2016-12-15 Thread Sean Owen
No, ALS is not modeling probabilities. The outputs are reconstructions of a
0/1 matrix. Most values will be in [0,1], but it's possible to get values
outside that range.
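
To see why a reconstruction can fall outside [0, 1], here is a minimal sketch in plain Python. The factor vectors are made-up numbers for illustration, not output from Spark. The only point is that a prediction is a dot product of two real-valued vectors, and nothing constrains it to a range.

```python
# Hypothetical rank-3 factor vectors (illustrative values, not from a real model).
user_factors = [0.5, -0.25, 0.75]
item_a = [1.0, 0.5, 0.5]     # roughly aligned with the user
item_b = [-0.5, 1.0, -0.25]  # points away from the user
item_c = [1.0, -1.0, 1.0]    # strongly aligned with the user


def predict(user, item):
    """Predicted interaction score: the dot product of the two factor vectors."""
    return sum(u * v for u, v in zip(user, item))


scores = {name: predict(user_factors, vec)
          for name, vec in [("a", item_a), ("b", item_b), ("c", item_c)]}
print(scores)  # {'a': 0.75, 'b': -0.6875, 'c': 1.5}
```

Item b gets a negative score and item c a score above 1, even though the training target for every cell is 0 or 1.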

On Thu, Dec 15, 2016 at 10:21 PM Manish Tripathi wrote:

> Hi,
>
> I ran the ALS model for implicit feedback. Then I used the .transform
> method of the model to predict the ratings for the original dataset. My
> dataset is of the form (user, item, rating).
>
> I see something like below:
>
> predictions.show(5, truncate=False)
>
>
> Why is the last prediction value negative? Isn't the transform method
> giving the prediction (probability) of seeing the rating as 1? I had count
> data for rating (implicit feedback), and for the validation dataset I
> binarized the rating (1 if >0 else 0). My training data has positive
> ratings (it's basically the count of views of a video).
>
> I used the following to train:
>
> als = ALS(rank=x, maxIter=15, regParam=y, implicitPrefs=True, alpha=40.0)
>
> model = als.fit(self.train)
>
> What does a negative prediction mean here, and is it OK to have that?