Re: Cross validation is missing in machine learning examples

2014-03-30 Thread Christopher Nguyen
Aureliano, you're correct that this is not validation error, which is computed as the residuals on out-of-training-sample data, and helps minimize overfit variance. However, in this example, the errors are correctly referred to as training error, which is what you might compute on a per-iteration

Cross validation is missing in machine learning examples

2014-03-29 Thread Aureliano Buendia
Hi, I notices spark machine learning examples use training data to validate regression models, For instance, in linear regressionhttp://spark.apache.org/docs/0.9.0/mllib-guide.htmlexample: // Evaluate model on training examples and compute training errorval valuesAndPreds = parsedData.map {