Hi, I am running LogisticRegressionWithSGD in spark 1.4.1 and it always takes 100 iterations to train (which is the default). It never meets the convergence criteria, shouldn't the convergence criteria for SGD be based on difference in logloss or the difference in accuracy on a held out test set ?
Code for convergence criteria: https://github.com/apache/spark/blob/c0e9ff1588b4d9313cc6ec6e00e5c7663eb67910/mllib/src/main/scala/org/apache/spark/mllib/optimization/GradientDescent.scala#L251 Thanks, Nishanth