I'm working on a problem that involves learning several different sets of
responses against the same set of training features. Right now I've written
the program to cycle through all of the different label sets, attach each one
to the training data, and run LogisticRegressionWithSGD on it, i.e.:

foreach curResponseSet in allResponses:
     currentRDD : RDD[LabeledPoint] = curResponseSet joined with trainingData
     LogisticRegressionWithSGD.train(currentRDD)


Each of the training runs is independent of the others, so it seems like I
should be able to parallelize them as well.
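
To make that concrete, here is a rough sketch in Python of the kind of thing I have in mind: hand each training run to a thread pool on the driver and collect the models at the end. I'm assuming Spark will accept jobs submitted from several driver threads at once. The names train_all, train_one and max_workers are made up for illustration. train_one stands in for the "join the labels onto trainingData and call LogisticRegressionWithSGD.train" step from the loop above, so the sketch itself does not touch Spark.

```python
from concurrent.futures import ThreadPoolExecutor


def train_all(all_responses, training_data, train_one, max_workers=4):
    """Run train_one(responses, training_data) for every response set.

    train_one is a hypothetical callable that would join one label set
    onto training_data and train a model on the result.
    Returns the results in the same order as all_responses.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Queue one independent training run per response set;
        # at most max_workers of them execute at the same time.
        futures = [pool.submit(train_one, responses, training_data)
                   for responses in all_responses]
        # result() blocks until that run finishes and re-raises its exception.
        return [f.result() for f in futures]
```

In the real program training_data would presumably be cached first, so that the concurrent runs don't each recompute it.
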
Is there a better way to do this?


Kyle
