Hi,

I built a transformer model around Spark's SVM for binary classification. I
basically implemented the predictRaw method required by the classifier and
classification model abstractions of the Spark API.

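override def predictRaw(dataMatrix: Vector): Vector = {
  // Signed margin of the linear SVM: w . x + b
  val m = weights.toBreeze.dot(dataMatrix.toBreeze) + intercept
  // Raw scores: index 0 for the negative class, index 1 for the positive class
  Vectors.dense(-m, m)
}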
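
For reference, the same computation can be sketched without Spark types. This
is only an illustration: plain arrays stand in for Spark vectors, and the
object name, the function name rawScores, and the toy numbers are made up for
the example rather than taken from my actual model.

```scala
object PredictRawSketch {
  // Hypothetical stand-in for predictRaw: plain arrays replace Spark vectors.
  def rawScores(weights: Array[Double], intercept: Double, features: Array[Double]): Array[Double] = {
    require(weights.length == features.length, "dimension mismatch")
    // Signed margin of the linear model: w . x + b
    val m = weights.zip(features).map { case (w, x) => w * x }.sum + intercept
    // Index 0 is the score for the negative class, index 1 for the positive class
    Array(-m, m)
  }

  def main(args: Array[String]): Unit = {
    // Toy values: 0.5 * 2.0 + (-1.0) * 1.0 + 0.25 = 0.25
    val scores = rawScores(Array(0.5, -1.0), 0.25, Array(2.0, 1.0))
    println(scores.mkString(", ")) // prints "-0.25, 0.25"
  }
}
```

A positive margin means the positive class gets the larger raw score, and a
negative margin means the negative class does.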


I have an imbalanced text dataset. The scores of logistic regression and
naive Bayes with a bag-of-words model are very high for author classification
in a OneVsRest setting, but the scores of SVM are very low. I am using the
standard SVM parameters with a maximum of 3000 iterations in OneVsRest.
What might be the problem? I am using the same features (200125), labels
(9), ~1500 training instances, ~500 test instances, and OneVsRest for all
the compared settings.
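
To make the setup concrete, here is a rough sketch of how I understand
OneVsRest to combine the binary models: each per-label model produces a raw
positive-class score, and the label with the largest score wins. The object
name, the function name predictLabel, and the toy scores are hypothetical, and
this is a simplification under that assumption, not Spark's actual code.

```scala
object OneVsRestSketch {
  // positiveScores(k) is the raw positive-class score (index 1 of predictRaw)
  // from the binary model trained for label k against all other labels.
  def predictLabel(positiveScores: Array[Double]): Int = {
    require(positiveScores.nonEmpty, "need at least one binary model")
    // Pick the label whose binary model gives the largest raw score.
    positiveScores.indices.maxBy(k => positiveScores(k))
  }

  def main(args: Array[String]): Unit = {
    // Toy margins for 3 labels; made-up numbers for illustration only.
    println(predictLabel(Array(-1.5, 0.25, -0.75))) // prints 1
  }
}
```

Note that in this scheme the margins from the different binary models are
compared directly, without any calibration.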


Thanks in advance...
Hayri Volkan Agun
PhD. Student - Anadolu University
