Hi User group,

We are using Spark's linear regression with SGD as the optimization technique,
and the results we are getting are far from optimal.

Can anyone shed some light on why this implementation seems to produce such 
poor results vs our own implementation?

Our dataset is very small, yet we need a very large number of iterations to
get results close to those of our implementation. We have tried both
normalizing and not normalizing the data, and we have tuned every parameter.
Our implementation is a closed-form solution, so it computes the least-squares
optimum directly. The Spark one is iterative and has no such guarantee, which
is understandable, but why is it so far off?
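
To make the comparison concrete, here is a minimal sketch of the two
approaches on a one-feature least-squares problem. It uses synthetic data and
hypothetical function names; it is neither Spark's implementation nor our
production code, and it uses full-batch gradient descent as a stand-in for SGD.

```python
# Illustrative only: synthetic data, hypothetical helper names.

def closed_form_fit(xs, ys):
    """Ordinary least squares for y = w*x + b, solved directly."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b


def gradient_descent_fit(xs, ys, step, iterations):
    """Full-batch gradient descent on mean squared error, starting from zero."""
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(iterations):
        # Gradient of (1 / 2n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum((w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= step * grad_w
        b -= step * grad_b
    return w, b


# Ten noise-free points on the line y = 3x + 2, with the feature left unscaled.
xs = [float(i) for i in range(1, 11)]
ys = [3.0 * x + 2.0 for x in xs]

w_exact, b_exact = closed_form_fit(xs, ys)
w_100, b_100 = gradient_descent_fit(xs, ys, step=0.01, iterations=100)
w_many, b_many = gradient_descent_fit(xs, ys, step=0.01, iterations=20000)

print("closed form:     ", w_exact, b_exact)
print("GD, 100 iters:   ", w_100, b_100)
print("GD, 20000 iters: ", w_many, b_many)
```

On this synthetic data, 100 iterations at a step of 0.01 leave the intercept
well short of the closed-form value, while 20000 iterations match it. The
unscaled feature makes the problem ill-conditioned, so the intercept direction
converges slowly even on ten points.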

Has anyone experienced this?

Steve Carman, M.S.
Artificial Intelligence Engineer
Coldlight-PTC
scar...@coldlight.com
