tanyinyan created SPARK-6349:
--------------------------------

             Summary: Add probability estimates in SVMModel predict result
                 Key: SPARK-6349
                 URL: https://issues.apache.org/jira/browse/SPARK-6349
             Project: Spark
          Issue Type: New Feature
          Components: MLlib
    Affects Versions: 1.2.1
            Reporter: tanyinyan


In SVMModel, the predictPoint method outputs either the raw margin (threshold 
not set) or a 1/0 label (threshold set). 
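
To make the two output modes concrete, here is a minimal sketch in plain 
Python. It is not MLlib code: the function name predict_point and its 
parameters are hypothetical, and the default threshold of 0.0 is an 
assumption about the usual setup rather than something stated in this issue.

```python
from typing import Optional, Sequence


def predict_point(
    features: Sequence[float],
    weights: Sequence[float],
    intercept: float,
    threshold: Optional[float] = 0.0,  # assumed default; None means "cleared"
) -> float:
    """Illustrative stand-in for a linear SVM prediction (hypothetical helper)."""
    # Raw margin: dot product of weights and features, plus the intercept.
    margin = sum(w * x for w, x in zip(weights, features)) + intercept
    if threshold is None:
        # Threshold cleared: return the raw margin unchanged.
        return margin
    # Threshold set: return a hard 1/0 label.
    return 1.0 if margin > threshold else 0.0


weights, intercept = [0.5, -1.0], 0.25
point = [1.0, 2.0]
print(predict_point(point, weights, intercept, threshold=None))  # -1.25
print(predict_point(point, weights, intercept, threshold=0.0))   # 0.0
print(predict_point(point, weights, intercept, threshold=-2.0))  # 1.0
```

The same point flips from 0 to 1 purely because the threshold moved, which 
is why the raw margin alone is hard to act on.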

When SVM is used as a classifier, it is hard to find a good threshold, and the 
raw margin is hard to interpret. 

When I use SVM on the dataset at 
https://www.kaggle.com/c/avazu-ctr-prediction/data, training on the first 
day's data (ignoring the fields id/device_id/device_ip; all remaining fields 
are treated as categorical variables and converted to sparse features before 
SVM) and predicting on the same data with the threshold cleared, the predicted 
margins are all negative. I have to set the threshold to -1 to get a 
reasonable confusion matrix.

So I suggest providing probability estimates in the SVMModel prediction 
result, as in libSVM (Platt's binary SVM probabilistic output).
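
As a rough illustration of what Platt-style calibration involves, the sketch 
below fits a sigmoid P(y=1 | f) = 1 / (1 + exp(A*f + B)) to (margin, label) 
pairs. It is a simplification under stated assumptions: it uses plain 
gradient descent instead of the Newton-type optimizer in Platt's method and 
libSVM, the names fit_platt and platt_probability are hypothetical, and the 
margins are made-up toy values, not results from the dataset above.

```python
import math
from typing import List, Sequence, Tuple


def _sigmoid(u: float) -> float:
    # Numerically stable logistic function.
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def fit_platt(
    margins: Sequence[float],
    labels: Sequence[int],
    lr: float = 0.1,      # step size, chosen for this toy example
    iters: int = 5000,    # iteration count, chosen for this toy example
) -> Tuple[float, float]:
    """Fit A and B by minimizing cross-entropy with gradient descent."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    # Platt's smoothed targets, used instead of hard 0/1 labels.
    t_pos = (n_pos + 1.0) / (n_pos + 2.0)
    t_neg = 1.0 / (n_neg + 2.0)
    targets: List[float] = [t_pos if y == 1 else t_neg for y in labels]

    a, b = 0.0, 0.0
    n = len(margins)
    for _ in range(iters):
        grad_a = grad_b = 0.0
        for f, t in zip(margins, targets):
            p = _sigmoid(-(a * f + b))
            # Gradient of the cross-entropy loss with respect to A and B.
            grad_a += (t - p) * f
            grad_b += t - p
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def platt_probability(margin: float, a: float, b: float) -> float:
    """Map a raw margin to an estimated probability of the positive class."""
    return _sigmoid(-(a * margin + b))


# Toy margins that are mostly negative, with positives nearer to zero.
toy_margins = [-3.0, -2.5, -2.0, -1.5, -0.5, -0.2, 0.1, 0.3]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
a, b = fit_platt(toy_margins, toy_labels)
for m in (-3.0, -1.0, 0.3):
    print(m, platt_probability(m, a, b))
```

With a calibrated output like this, a user could pick a cutoff on a 
probability scale instead of guessing a margin threshold such as -1. In 
practice the sigmoid would be fitted on held-out margins rather than on the 
training data.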



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
