Dong Wang created SPARK-1682:
--------------------------------

             Summary: LogisticRegressionWithSGD should support svmlight data and gradient descent w/o sampling
                 Key: SPARK-1682
                 URL: https://issues.apache.org/jira/browse/SPARK-1682
             Project: Spark
          Issue Type: Improvement
          Components: MLlib
    Affects Versions: 1.0.0
            Reporter: Dong Wang
             Fix For: 1.0.0


The LogisticRegressionWithSGD example does not expose the following 
capabilities that already exist inside MLlib:
  * reading svmlight data
  * regularization with L1 and L2
  * adding an intercept
  * writing the model to a file
  * reading a model and generating predictions
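
For reference, the svmlight format mentioned in the first item stores one 
instance per line as "label index:value index:value ...", with 1-based feature 
indices and an optional trailing "#" comment. The following is a minimal 
plain-Python sketch of parsing such a line. It is only an illustration of the 
format, not MLlib's loader; the function name parse_svmlight_line is 
hypothetical.

```python
def parse_svmlight_line(line):
    """Parse one svmlight line into (label, {zero_based_index: value}).

    Hypothetical helper for illustration; not part of MLlib.
    """
    # Drop a trailing "#" comment, if any, then trim whitespace.
    body = line.split("#", 1)[0].strip()
    if not body:
        return None  # blank or comment-only line
    tokens = body.split()
    label = float(tokens[0])
    features = {}
    for token in tokens[1:]:
        index, value = token.split(":", 1)
        if index == "qid":
            continue  # query id used for ranking data; ignored here
        # svmlight indices are 1-based; store them 0-based.
        features[int(index) - 1] = float(value)
    return label, features


print(parse_svmlight_line("1 3:0.5 10:2.0 # first row"))
# prints (1.0, {2: 0.5, 9: 2.0})
```

Keeping the features as a sparse index-to-value mapping avoids materializing 
zeros, which is the main reason the format is used for high-dimensional data.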

The GradientDescent optimizer samples the data before each gradient step. When 
the input data has already been shuffled, it is possible to scan the data 
sequentially and take a gradient step for each data instance instead. This 
could potentially be more efficient.
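
The idea of scanning pre-shuffled data and updating after every instance can be 
sketched in plain Python as below. This is a single-machine illustration under 
stated assumptions, not MLlib's GradientDescent and not a proposed patch: 
labels are assumed to be 0 or 1, features are sparse dicts, and the names 
sgd_logistic_pass, step_size and reg_param are hypothetical. The L2 term is 
applied only to the coordinates an instance touches, which is a simplification.

```python
import math


def sgd_logistic_pass(data, weights, intercept, step_size, reg_param=0.0):
    """One sequential pass over data: update after every instance, no sampling.

    data is an iterable of (label, features) with label in {0, 1} and
    features a dict {index: value}; it is assumed to be shuffled already.
    """
    for label, features in data:
        margin = intercept + sum(weights[i] * v for i, v in features.items())
        # Numerically stable sigmoid.
        if margin >= 0:
            prediction = 1.0 / (1.0 + math.exp(-margin))
        else:
            e = math.exp(margin)
            prediction = e / (1.0 + e)
        error = prediction - label  # gradient of the log loss w.r.t. margin
        for i, v in features.items():
            # L2 penalty applied only to touched coordinates (simplification).
            weights[i] -= step_size * (error * v + reg_param * weights[i])
        intercept -= step_size * error  # intercept is not regularized
    return weights, intercept


# Tiny made-up data set: positive feature values belong to class 1.
data = [(1.0, {0: 2.0}), (0.0, {0: -2.0}), (1.0, {0: 1.5}), (0.0, {0: -1.0})]
weights, intercept = [0.0], 0.0
for _ in range(50):
    weights, intercept = sgd_logistic_pass(data, weights, intercept, step_size=0.1)
```

Each pass touches every instance exactly once, so no data is skipped or 
revisited within a pass; whether this is faster than sampled mini-batches in a 
distributed setting would still need to be measured.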



--
This message was sent by Atlassian JIRA
(v6.2#6252)
