It's always better to use a quasi-Newton solver if the runtime and problem
scale permit, since there are convergence guarantees on the optimization.
OWLQN and BFGS are both quasi-Newton.

Most single-node code bases run quasi-Newton solvers. If you are using SGD,
it is better to use AdaDelta/AdaGrad or similar tricks. David added some of
them to Breeze recently.
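
To make the trade-off concrete, here is a minimal pure-Python sketch. It is
not Spark or Breeze code; the function names, the toy dataset and the step
sizes are all assumptions chosen for illustration. It fits y = b + w*x three
ways: in closed form, with fixed-step gradient descent, and with an
AdaGrad-style per-coordinate step. Full-batch gradients are used instead of
sampled ones so the runs are deterministic.

```python
def closed_form(xs, ys):
    """Ordinary least squares for y = b + w*x via the normal equations."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    w = sxy / sxx
    return my - w * mx, w


def loss_and_grad(b, w, xs, ys):
    """Half mean squared error and its gradient with respect to (b, w)."""
    n = len(xs)
    residuals = [b + w * x - y for x, y in zip(xs, ys)]
    loss = sum(r * r for r in residuals) / (2 * n)
    grad_b = sum(residuals) / n
    grad_w = sum(r * x for r, x in zip(residuals, xs)) / n
    return loss, grad_b, grad_w


def gradient_descent(xs, ys, step, iterations):
    """Fixed-step gradient descent starting from (0, 0)."""
    b = w = 0.0
    for _ in range(iterations):
        _, gb, gw = loss_and_grad(b, w, xs, ys)
        b -= step * gb
        w -= step * gw
    return b, w


def adagrad(xs, ys, step, iterations, eps=1e-8):
    """AdaGrad: scale each coordinate by its accumulated squared gradients."""
    b = w = 0.0
    acc_b = acc_w = 0.0
    for _ in range(iterations):
        _, gb, gw = loss_and_grad(b, w, xs, ys)
        acc_b += gb * gb
        acc_w += gw * gw
        b -= step * gb / (acc_b ** 0.5 + eps)
        w -= step * gw / (acc_w ** 0.5 + eps)
    return b, w


# Toy, noise-free data generated from y = 2 + 3x (an assumption for the demo).
xs = [float(i) for i in range(10)]
ys = [2.0 + 3.0 * x for x in xs]

print("closed form:", closed_form(xs, ys))
for iters in (100, 1000, 5000):
    print("fixed step, %d iterations:" % iters,
          gradient_descent(xs, ys, 0.01, iters))
print("AdaGrad, 1000 iterations:", adagrad(xs, ys, 1.0, 1000))
```

With these settings the fixed-step run still has a clearly wrong intercept
after 100 iterations and needs thousands to approach the closed-form answer,
even on ten noise-free points. AdaGrad does not remove the need for
iterations, but it picks a separate effective step for each coordinate, which
makes the result less sensitive to the single step size you choose.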
On Jun 9, 2015 7:25 PM, "DB Tsai" <dbt...@dbtsai.com> wrote:

> As Robin suggested, you may try the following new implementation.
>
>
> https://github.com/apache/spark/commit/6a827d5d1ec520f129e42c3818fe7d0d870dcbef
>
> Thanks.
>
> Sincerely,
>
> DB Tsai
> ----------------------------------------------------------
> Blog: https://www.dbtsai.com
> PGP Key ID: 0xAF08DF8D
> <https://pgp.mit.edu/pks/lookup?search=0x59DF55B8AF08DF8D>
>
> On Tue, Jun 9, 2015 at 3:22 PM, Robin East <robin.e...@xense.co.uk> wrote:
>
>> Hi Stephen
>>
>> How many is a very large number of iterations? SGD is notorious for
>> requiring 100s or 1000s of iterations, and you may also need to spend some
>> time tweaking the step size. In 1.4 there is an implementation of ElasticNet
>> Linear Regression which is supposed to compare favourably with an
>> equivalent R implementation.
>> > On 9 Jun 2015, at 22:05, Stephen Carman <scar...@coldlight.com> wrote:
>> >
>> > Hi User group,
>> >
>> > We are using Spark Linear Regression with SGD as the optimization
>> > technique and we are achieving very sub-optimal results.
>> >
>> > Can anyone shed some light on why this implementation seems to produce
>> > such poor results vs our own implementation?
>> >
>> > We are using a very small dataset, but we have to use a very large
>> > number of iterations to achieve similar results to our implementation.
>> > We've tried normalizing the data, not normalizing the data, and tuning
>> > every param. Our implementation is a closed form solution so we should
>> > be guaranteed convergence but the Spark one is not, which is
>> > understandable, but why is it so far off?
>> >
>> > Has anyone experienced this?
>> >
>> > Steve Carman, M.S.
>> > Artificial Intelligence Engineer
>> > Coldlight-PTC
>> > scar...@coldlight.com
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> > For additional commands, e-mail: user-h...@spark.apache.org
>> >
>>
>
