Are you sure this is an apples-to-apples comparison? For example, does
your SAS process normalize or otherwise transform the data first?
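
If not, it's worth standardizing the feature before training -- SGD
struggles when a feature ranges into the tens of thousands. A minimal
sketch with MLlib's StandardScaler, assuming data is an RDD[LabeledPoint]
(the variable names here are illustrative):

  import org.apache.spark.mllib.feature.StandardScaler
  import org.apache.spark.mllib.regression.LabeledPoint

  // Standardize features to zero mean and unit variance.
  // withMean = true requires dense feature vectors.
  val scaler = new StandardScaler(withMean = true, withStd = true)
    .fit(data.map(_.features))
  val scaledData = data.map(p =>
    LabeledPoint(p.label, scaler.transform(p.features)))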

Is the optimization configured the same way in both cases -- same
regularization, step size, number of iterations, etc.?
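
For reference, these are the knobs on the SGD optimizer; the values
below are illustrative, not recommendations:

  import org.apache.spark.mllib.classification.LogisticRegressionWithSGD

  val lr = new LogisticRegressionWithSGD()
  lr.optimizer
    .setNumIterations(200)   // SGD may need many passes to converge
    .setStepSize(1.0)        // too large a step can blow up the weights
    .setRegParam(0.0)        // match whatever SAS uses (likely none)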

Are you sure you are pulling out the intercept correctly? In Spark it is
stored as a separate value on the logistic regression model, not as part
of the weights vector.
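
For example, continuing the sketch above (scaledData and lr are the
hypothetical values from the earlier snippets):

  // The intercept is not fit unless you ask for it.
  lr.setIntercept(true)
  val model = lr.run(scaledData)

  // model.intercept is a separate field; it is not in model.weights.
  println(s"weights = ${model.weights}, intercept = ${model.intercept}")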

On Thu, Dec 18, 2014 at 4:34 PM, Franco Barrientos <
franco.barrien...@exalitica.com> wrote:
>
> Hi all,
>
> I have a problem with LogisticRegressionWithSGD. When I train a data set
> with one variable (which is the amount of an item) plus an intercept, I
> get weights of (-0.4021, -207.1749) for the two features, respectively.
> This doesn't make sense to me, because when I run a logistic regression
> on the same data in SAS I get these weights: (-2.6604, 0.000245).
>
> The range of this variable is 0 to 59102, with a mean of 1158.
>
> The problem comes when I calculate the probability for each user in the
> data set: in many cases it is zero or near zero, because when Spark
> calculates exp(-1*(-0.4021+(-207.1749)*amount)) the result is a huge
> number, in fact infinity for Spark.
>
> How should I treat this variable, and why does this happen?
>
> Thanks,
>
> Franco Barrientos
> Data Scientist
>
> Málaga #115, Of. 1003, Las Condes.
> Santiago, Chile.
> (+562)-29699649
> (+569)-76347893
>
> franco.barrien...@exalitica.com
>
> www.exalitica.com
>
