Re: No Intercept for Python

2014-06-18 Thread Naftali Harris
Thanks Reza! :-D

Naftali


On Wed, Jun 18, 2014 at 1:47 PM, Reza Zadeh wrote:

> Hi Naftali,
>
> Yes, you're right. For now, please add a column of ones. We are working on
> adding a weighted regularization term, and exposing the Scala intercept
> option in the Python binding.
>
> Best,
> Reza
>
>
> On Mon, Jun 16, 2014 at 12:19 PM, Naftali Harris wrote:
>
>> Hi everyone,
>>
>> The Python LogisticRegressionWithSGD does not appear to estimate an
>> intercept.  When I run the following, the returned weights and intercept
>> are both 0.0:
>>
>> from pyspark import SparkContext
>> from pyspark.mllib.regression import LabeledPoint
>> from pyspark.mllib.classification import LogisticRegressionWithSGD
>>
>> def main():
>>     sc = SparkContext(appName="NoIntercept")
>>
>>     train = sc.parallelize([LabeledPoint(0, [0]), LabeledPoint(1, [0]),
>>                             LabeledPoint(1, [0])])
>>
>>     model = LogisticRegressionWithSGD.train(train, iterations=500,
>>                                             step=0.1)
>>     print "Final weights: " + str(model.weights)
>>     print "Final intercept: " + str(model.intercept)
>>
>> if __name__ == "__main__":
>>     main()
>>
>>
>> Of course, one can fit an intercept with the simple expedient of adding a
>> column of ones, but that's kind of annoying.  Moreover, it looks like the
>> Scala version has an intercept option.
>>
>> Am I missing something? Should I just add the column of ones? If I
>> submitted a PR doing that, is that the sort of thing you guys would accept?
>>
>> Thanks! :-)
>>
>> Naftali
>>
>
>


Re: No Intercept for Python

2014-06-18 Thread Reza Zadeh
Hi Naftali,

Yes, you're right. For now, please add a column of ones. We are working on
adding a weighted regularization term, and exposing the Scala intercept
option in the Python binding.

Best,
Reza
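
[Editor's sketch, not part of the original message.] The column-of-ones
workaround mentioned above can be illustrated without Spark. The helper
names add_intercept_column and fit_logistic are hypothetical, and the
optimizer is plain batch gradient descent, which is an assumption made for
illustration and not MLlib's SGD implementation. The data is the same
three-point set as in the quoted report.

```python
import math


def add_intercept_column(features):
    """Return a new feature list with a constant 1.0 prepended (hypothetical helper)."""
    return [1.0] + [float(x) for x in features]


def fit_logistic(rows, iterations=500, step=0.1):
    """Fit logistic-regression weights with no separate intercept term.

    rows is a list of (label, features) pairs. This is plain batch gradient
    descent on the mean logistic loss, used only for illustration.
    """
    n_features = len(rows[0][1])
    weights = [0.0] * n_features
    for _ in range(iterations):
        grad = [0.0] * n_features
        for label, features in rows:
            margin = sum(w * x for w, x in zip(weights, features))
            p = 1.0 / (1.0 + math.exp(-margin))  # predicted P(label = 1)
            for j, x in enumerate(features):
                grad[j] += (p - label) * x
        weights = [w - step * g / len(rows) for w, g in zip(weights, grad)]
    return weights


# Same data as the report: the single feature is always 0.
data = [(0, [0.0]), (1, [0.0]), (1, [0.0])]

# Without a ones column every gradient component is multiplied by 0,
# so the weight never moves away from its starting value.
plain = fit_logistic(data)

# With a ones column, the first weight acts as the intercept.
augmented = fit_logistic([(y, add_intercept_column(x)) for y, x in data])

print("Without ones column: " + str(plain))
print("With ones column:    " + str(augmented))
```

With the ones column, the first weight moves toward log(2), about 0.693,
which is the log-odds of the 2-to-1 label split. In PySpark the same idea
would presumably be applied by building each point as
LabeledPoint(label, add_intercept_column(features)) before training and
reading the intercept from the first entry of model.weights.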


On Mon, Jun 16, 2014 at 12:19 PM, Naftali Harris wrote:

> Hi everyone,
>
> The Python LogisticRegressionWithSGD does not appear to estimate an
> intercept.  When I run the following, the returned weights and intercept
> are both 0.0:
>
> from pyspark import SparkContext
> from pyspark.mllib.regression import LabeledPoint
> from pyspark.mllib.classification import LogisticRegressionWithSGD
>
> def main():
>     sc = SparkContext(appName="NoIntercept")
>
>     train = sc.parallelize([LabeledPoint(0, [0]), LabeledPoint(1, [0]),
>                             LabeledPoint(1, [0])])
>
>     model = LogisticRegressionWithSGD.train(train, iterations=500,
>                                             step=0.1)
>     print "Final weights: " + str(model.weights)
>     print "Final intercept: " + str(model.intercept)
>
> if __name__ == "__main__":
>     main()
>
>
> Of course, one can fit an intercept with the simple expedient of adding a
> column of ones, but that's kind of annoying.  Moreover, it looks like the
> Scala version has an intercept option.
>
> Am I missing something? Should I just add the column of ones? If I
> submitted a PR doing that, is that the sort of thing you guys would accept?
>
> Thanks! :-)
>
> Naftali
>