Yes, without the "amount" variables the results are similar. When I put in other variables it's fine.
From: Sean Owen [mailto:so...@cloudera.com]
Sent: Thursday, December 18, 2014 14:22
To: Franco Barrientos
Cc: user@spark.apache.org
Subject: Re: Effects problems in logistic regression

Are you sure this is an apples-to-apples comparison? For example, does your SAS process normalize or otherwise transform the data first? Is the optimization configured similarly in both cases -- same regularization, etc.? Are you sure you are pulling out the intercept correctly? It is a separate value from the logistic regression model in Spark.

On Thu, Dec 18, 2014 at 4:34 PM, Franco Barrientos <franco.barrien...@exalitica.com> wrote:

Hi all! I have a problem with LogisticRegressionWithSGD. When I train a data set with one variable (which is the amount of an item) plus an intercept, I get weights of (-0.4021, -207.1749) for the two features, respectively. This doesn't make sense to me, because when I run a logistic regression on the same data in SAS I get weights of (-2.6604, 0.000245). The variable ranges from 0 to 59102 with a mean of 1158. The problem is that when I calculate the probability for each user in the data set, it comes out as zero or near zero in many cases, because when Spark computes exp(-1*(-0.4021+(-207.1749)*amount)) the result is a huge number, in fact infinity for Spark. How should I treat this variable, and why does this happen?

Thanks,

Franco Barrientos
Data Scientist

Málaga #115, Of. 1003, Las Condes.
Santiago, Chile.
(+562)-29699649
(+569)-76347893
franco.barrien...@exalitica.com
www.exalitica.com
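(To make the overflow concrete: at the mean amount of 1158, the margin is roughly -0.4021 - 207.1749 * 1158 ≈ -239909, so exp(239909) overflows a Double to Infinity and the sigmoid 1/(1+exp(-margin)) collapses to exactly 0. The usual fix with SGD is to standardize the feature first. Below is a minimal sketch assuming the Spark 1.x MLlib Scala API; `data` is a hypothetical RDD[LabeledPoint] holding the raw amount feature, and 100 iterations is an arbitrary choice:)

    import org.apache.spark.mllib.classification.LogisticRegressionWithSGD
    import org.apache.spark.mllib.feature.StandardScaler
    import org.apache.spark.mllib.regression.LabeledPoint

    // Standardize to zero mean / unit variance so the SGD step size is
    // reasonable for a feature that ranges from 0 to 59102.
    val scaler = new StandardScaler(withMean = true, withStd = true)
      .fit(data.map(_.features))

    val scaled = data
      .map(lp => LabeledPoint(lp.label, scaler.transform(lp.features)))
      .cache()

    // Let Spark fit the intercept rather than adding a constant column;
    // as Sean notes, it is stored separately from the weight vector.
    val lr = new LogisticRegressionWithSGD()
    lr.setIntercept(true)
    lr.optimizer.setNumIterations(100)

    val model = lr.run(scaled)
    println(s"intercept = ${model.intercept}, weights = ${model.weights}")

(Remember to apply the same scaler.transform to any new points before scoring them, or fold the scaling back into the weights if you need coefficients on the original units, as SAS reports them.)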