Hi user group,

We are using Spark's linear regression with SGD as the optimization technique, and the results we get are far from optimal.
Can anyone shed some light on why this implementation produces such poor results compared with our own? Our dataset is very small, yet Spark needs a very large number of iterations to get close to what our implementation produces. We have tried both normalizing and not normalizing the data, and we have tuned every parameter.

Our implementation is a closed-form solution, so it is guaranteed to reach the optimum, while the Spark one is iterative and has no such guarantee. That much is understandable, but why is it so far off? Has anyone else experienced this?

Steve Carman, M.S.
Artificial Intelligence Engineer
Coldlight-PTC
scar...@coldlight.com
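
For reference, the kind of comparison described above can be sketched in plain Python. This is not Spark code and not our actual implementation: the toy data, the step size of 0.05, the iteration counts, and the `step_size / sqrt(t)` decay are all illustrative assumptions. I believe MLlib's default updater shrinks the step in this way, but that should be checked against the Spark source.

```python
import math


def closed_form(xs, ys):
    """Ordinary least squares for y = w*x + b, solved directly."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx
    return w, mean_y - w * mean_x


def gradient_descent(xs, ys, step_size, iterations):
    """Full-batch gradient descent on half the mean squared error.

    The step shrinks as step_size / sqrt(t); this decay schedule is an
    assumption made for the sketch.
    """
    n = len(xs)
    w, b = 0.0, 0.0
    for t in range(1, iterations + 1):
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = sum(r * x for r, x in zip(residuals, xs)) / n
        grad_b = sum(residuals) / n
        step = step_size / math.sqrt(t)
        w -= step * grad_w
        b -= step * grad_b
    return w, b


def mse(xs, ys, w, b):
    """Mean squared error of the line y = w*x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy, noise-free data on the line y = 3x + 2 (made up for illustration).
xs = [float(i) for i in range(10)]
ys = [3.0 * x + 2.0 for x in xs]

w_exact, b_exact = closed_form(xs, ys)
print("closed form:", w_exact, b_exact, mse(xs, ys, w_exact, b_exact))

for iters in (100, 10000):
    w_gd, b_gd = gradient_descent(xs, ys, step_size=0.05, iterations=iters)
    print("gradient descent,", iters, "iterations:", w_gd, b_gd, mse(xs, ys, w_gd, b_gd))
```

On this toy data the closed form recovers the line directly, while the iterative version should still be noticeably off after 100 iterations and only gets close after many more. The unscaled feature and the intercept make the problem poorly conditioned, and the shrinking step slows progress further, which is the sort of gap we are asking about.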