Hi! I want to run each fold of a k-fold cross-validation in parallel, using all
6 of my CPUs at once. My model is a hidden Markov model (HMM). In every fold I
want to train it on the training portion of the data, extract the anomaly score
(negative log-likelihood) of each sequence in the test portion, and evaluate
the fold with a ROC curve.
I have found the function cross_validate(), which seems to offer the option of
running things in parallel with n_jobs=-1.
I assume the estimator is then my HMM model.
As of now I'm using pomegranate to train the model and extract the anomaly
score of the test sequences.
I don't understand how to call the cross_validate function with the right
arguments for my HMM model. None of the examples I've seen use an HMM. I'm
confused about where to specify the number of hidden states if I'm not calling
my usual pomegranate function from_samples(), which I've used before.
Also, how can I extract the anomaly scores within each fold using this function?
I'm unsure what exactly is happening within the cross_validate function and
how to control it the way I need.
If anyone has an example or explanation or another idea on how to run the folds
in parallel, I would really appreciate it!
This is my attempt at using cross_validate, which gets stuck or does not seem
to finish (although I'm quite sure I'm not using it properly):
import pomegranate
from sklearn.model_selection import cross_validate

model = pomegranate.HiddenMarkovModel()
results = cross_validate(model, listToUse, y=None, groups=None, scoring=None,
                         cv=3, n_jobs=-1, verbose=10)
print(results)
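
For context, cross_validate clones the estimator and calls fit on each training split, so it needs an object that follows the scikit-learn estimator interface. A minimal sketch of such a wrapper follows, under stated assumptions: SequenceAnomalyModel, anomaly_scores, roc_auc_scorer and evaluate are hypothetical names, a single Gaussian stands in for the HMM so the sketch runs without pomegranate, and labels (0 = normal, 1 = anomalous) are assumed to exist for the ROC evaluation. The constructor argument is where the number of hidden states would be passed.

```python
import numpy as np
from sklearn.base import BaseEstimator
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_validate


class SequenceAnomalyModel(BaseEstimator):
    """Hypothetical wrapper; a single Gaussian stands in for the HMM."""

    def __init__(self, n_states=3):
        # Where the number of hidden states would be configured.
        # The Gaussian stand-in below does not use it.
        self.n_states = n_states

    def fit(self, X, y=None):
        # X is a list of 1-D sequences; HMM training would go here.
        values = np.concatenate([np.asarray(s, dtype=float) for s in X])
        self.mean_ = values.mean()
        self.std_ = values.std() + 1e-9
        return self

    def anomaly_scores(self, X):
        # Negative log-likelihood of each sequence under the fitted model.
        const = np.log(self.std_) + 0.5 * np.log(2 * np.pi)
        return [
            float(np.sum(0.5 * ((np.asarray(s) - self.mean_) / self.std_) ** 2
                         + const))
            for s in X
        ]


def roc_auc_scorer(estimator, X, y):
    # Callable scorer: receives the fitted estimator and one test split.
    return roc_auc_score(y, estimator.anomaly_scores(X))


def evaluate(sequences, labels, n_states=3, n_splits=3, n_jobs=-1):
    # Stratified folds (an assumption) keep both classes in every test split.
    cv = StratifiedKFold(n_splits=n_splits)
    return cross_validate(SequenceAnomalyModel(n_states=n_states),
                          sequences, y=labels, scoring=roc_auc_scorer,
                          cv=cv, n_jobs=n_jobs)
```

Calling evaluate(sequences, labels) would return a dict whose "test_score" entry holds one ROC AUC per fold. Note that cross_validate reports only these per-fold scalars, not the per-sequence anomaly scores.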
Below is how I've manually set my cross-validation up as of now:
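import time

import numpy
from sklearn.model_selection import KFold

# Build the [train, test] pair for each fold.
listExample = []
kfold = KFold(10, shuffle=True)
for train, test in kfold.split(listToUse):
    listExample.append([listToUse[train], listToUse[test]])

scoreList = []
for ex in listExample:
    hmmModel = hmm.hmm(ex[0])  # train on this fold's training portion
    scoreListFold = []
    mid = time.time()
    for li in ex[1]:
        # log-probability of each test sequence under the trained model
        prob = hmmModel.log_probability(li)
        scoreListFold.append(prob)
    scoreList.append(numpy.mean(scoreListFold))
avg = numpy.mean(scoreList)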
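
One other idea for running the manual loop above in parallel is joblib, which scikit-learn itself uses for n_jobs. The sketch below is only an illustration under assumptions: fit_model and neg_log_likelihood are hypothetical stand-ins (a single Gaussian instead of the HMM training and log_probability calls), labels (0 = normal, 1 = anomalous) are assumed to be available for the ROC AUC, and StratifiedKFold replaces KFold so that every test fold contains both classes.

```python
import numpy as np
from joblib import Parallel, delayed
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold


def fit_model(train_seqs):
    """Stand-in for HMM training: fit one Gaussian to all training values."""
    values = np.concatenate([np.asarray(s, dtype=float) for s in train_seqs])
    return values.mean(), values.std() + 1e-9


def neg_log_likelihood(model, seq):
    """Stand-in for the negated log_probability of one sequence."""
    mean, std = model
    z = (np.asarray(seq, dtype=float) - mean) / std
    return float(np.sum(0.5 * z ** 2 + np.log(std) + 0.5 * np.log(2 * np.pi)))


def run_fold(sequences, labels, train_idx, test_idx):
    """Train on one fold's training part and score its test part."""
    model = fit_model([sequences[i] for i in train_idx])
    scores = [neg_log_likelihood(model, sequences[i]) for i in test_idx]
    auc = roc_auc_score(labels[test_idx], scores)
    return scores, auc


def run_folds(sequences, labels, n_splits=3, n_jobs=-1, seed=0):
    """Run all folds, each as a separate joblib task."""
    labels = np.asarray(labels)
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    splits = cv.split(np.zeros(len(labels)), labels)
    return Parallel(n_jobs=n_jobs)(
        delayed(run_fold)(sequences, labels, tr, te) for tr, te in splits
    )
```

run_folds(sequences, labels) would return one (scores, auc) pair per fold, so the per-sequence anomaly scores stay available alongside the fold-level ROC AUC. With n_jobs=-1 joblib uses all available cores.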
Thanks again!
Anni