Thanks for the info.

How do I use StandardScaler() to scale the example data (10246.0,[14111.0,1.0])?

Thx
tri

-----Original Message-----
From: dbt...@dbtsai.com [mailto:dbt...@dbtsai.com] 
Sent: Friday, December 12, 2014 1:26 PM
To: Bui, Tri
Cc: user@spark.apache.org
Subject: Re: Do I need to apply feature scaling via StandardScaler for LBFGS 
for Linear Regression?

It seems that your response is not scaled, which will cause issues in LBFGS. 
Typically, people train linear regression with zero-mean/unit-variance features 
and a zero-mean response, without training the intercept. Since the response is 
zero-mean, the intercept will always be zero. When you convert the coefficients 
from the scaled space back to the original space, the intercept can be computed 
as w0 = <y> - \sum_n <x_n> w_n, where <x_n> is the average of column n and <y> 
is the average of the response.
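In plain Scala, that back-transform might look like the following sketch (the 
`xMean`, `yMean`, and `w` values here are made up for illustration, not taken 
from your data):

```scala
// Recover the original-space intercept after training on zero-mean data.
// xMean holds the per-column feature means, yMean the response mean, and
// w the coefficients expressed in the original (unscaled) space.
object InterceptRecovery {
  def intercept(xMean: Array[Double], w: Array[Double], yMean: Double): Double =
    yMean - xMean.zip(w).map { case (m, wi) => m * wi }.sum

  def main(args: Array[String]): Unit = {
    val xMean = Array(2.0, 10.0) // hypothetical column averages
    val yMean = 50.0             // hypothetical response average
    val w     = Array(3.0, 1.5)  // hypothetical original-space weights
    val w0 = intercept(xMean, w, yMean)
    println(w0) // 50.0 - (2.0*3.0 + 10.0*1.5) = 29.0
  }
}
```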

Sincerely,

DB Tsai
-------------------------------------------------------
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai


On Fri, Dec 12, 2014 at 10:49 AM, Bui, Tri <tri....@verizonwireless.com> wrote:
> Thanks for the confirmation.
>
> Fyi.. the code below works for a similar dataset; with the feature magnitudes 
> changed, LBFGS converged to the right weights.
>
> For example, a time-sequential feature with values 1, 2, 3, 4, 5 would 
> generate the error, while the sequential feature 14111, 14112, 14113, 14115 
> would converge to the right weights. Why?
>
> Below is the code to implement StandardScaler() for the sample data 
> (10246.0,[14111.0,1.0]):
>
> val scaler1 = new StandardScaler().fit(train.map(x => x.features))
> val train1 = train.map(x => (x.label, scaler1.transform(x.features)))
>
> But I keep getting the error: "value features is not a member of (Double, 
> org.apache.spark.mllib.linalg.Vector)"
>
> Should my feature vector be .toInt instead of Double?
>
> Also, shouldn't the org.apache.spark.mllib.linalg.Vector in the error have an 
> "s", to match the imported org.apache.spark.mllib.linalg.Vectors?
>
> Thanks
> Tri
>
>
>
>
>
> -----Original Message-----
> From: dbt...@dbtsai.com [mailto:dbt...@dbtsai.com]
> Sent: Friday, December 12, 2014 12:16 PM
> To: Bui, Tri
> Cc: user@spark.apache.org
> Subject: Re: Do I need to apply feature scaling via StandardScaler for 
> LBFGS for Linear Regression?
>
> You need to apply the StandardScaler yourself to help the convergence.
> LBFGS just takes whatever objective function you provide, without doing any 
> scaling. I would like to provide a LinearRegressionWithLBFGS that does the 
> scaling internally in the near future.
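(Editorial note: a scaled pipeline along these lines might look like the 
following sketch against the MLlib 1.x API; it assumes the `parsedata` 
RDD[LabeledPoint] from the code further down the thread, and centers the 
response as well as the features, per the advice above:)

```scala
import org.apache.spark.mllib.feature.StandardScaler

// Scale features to zero mean / unit variance so LBFGS sees a
// well-conditioned objective.
val featScaler = new StandardScaler(withMean = true, withStd = true)
  .fit(parsedata.map(_.features))

// Center the response too; with a zero-mean label the intercept is zero,
// so no bias column is needed during training.
val yMean = parsedata.map(_.label).mean()

val scaled = parsedata.map(lp =>
  (lp.label - yMean, featScaler.transform(lp.features))).cache()
```

(Again a sketch, not a runnable standalone program: it requires a Spark 
context. The original-space intercept can be recovered afterwards from `yMean` 
and the column means, as described above.)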
>
> Sincerely,
>
> DB Tsai
> -------------------------------------------------------
> My Blog: https://www.dbtsai.com
> LinkedIn: https://www.linkedin.com/in/dbtsai
>
>
> On Fri, Dec 12, 2014 at 8:49 AM, Bui, Tri 
> <tri....@verizonwireless.com.invalid> wrote:
>> Hi,
>>
>>
>>
>> Trying to use LBFGS as the optimizer, do I need to implement feature 
>> scaling via StandardScaler or does LBFGS do it by default?
>>
>>
>>
>> The following code generated the error “Failure again!  Giving up and 
>> returning. Maybe the objective is just poorly behaved?”.
>>
>>
>>
>> val data = sc.textFile("file:///data/Train/final2.train")
>>
>> val parsedata = data.map { line =>
>>   val partsdata = line.split(',')
>>   LabeledPoint(partsdata(0).toDouble,
>>     Vectors.dense(partsdata(1).split(' ').map(_.toDouble)))
>> }
>>
>> val train = parsedata.map(x =>
>>   (x.label, MLUtils.appendBias(x.features))).cache()
>>
>>
>>
>> val numCorrections = 10
>> val convergenceTol = 1e-4
>> val maxNumIterations = 50
>> val regParam = 0.1
>> val initialWeightsWithIntercept = Vectors.dense(new Array[Double](2))
>>
>> val (weightsWithIntercept, loss) = LBFGS.runLBFGS(train,
>>   new LeastSquaresGradient(),
>>   new SquaredL2Updater(),
>>   numCorrections,
>>   convergenceTol,
>>   maxNumIterations,
>>   regParam,
>>   initialWeightsWithIntercept)
>>
>>
>>
>> Did I implement LBFGS for Linear Regression via “LeastSquaresGradient()”
>> correctly?
>>
>>
>>
>> Thanks
>>
>> Tri
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For 
> additional commands, e-mail: user-h...@spark.apache.org
>
