I think this should represent the label of LabeledPoint (0 means negative, 1 means positive): http://spark.apache.org/docs/latest/mllib-data-types.html#labeled-point
The document you mention is for the mathematical formula, not the implementation.

On Sun, Nov 29, 2015 at 9:13 AM, Tarek Elgamal <tarek.elga...@gmail.com> wrote:

> According to the documentation
> <http://spark.apache.org/docs/latest/mllib-linear-methods.html>, by
> default, if w^T x ≥ 0 then the outcome is positive, and negative otherwise.
> I suppose that w^T x is the "score" in my case. If the score is at least 0
> and the label is positive, then I return 1, which is a correct
> classification, and I return zero otherwise. Do you have any idea how to
> classify a point as positive or negative using this score or another
> function?
>
> On Sat, Nov 28, 2015 at 5:14 AM, Jeff Zhang <zjf...@gmail.com> wrote:
>
>> if((score >=0 && label == 1) || (score <0 && label == 0))
>> {
>>     return 1; //correct classification
>> }
>> else
>>     return 0;
>>
>> I suspect score is always between 0 and 1
>>
>> On Sat, Nov 28, 2015 at 10:39 AM, Tarek Elgamal <tarek.elga...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I am trying to run the straightforward example of SVM but I am getting
>>> low accuracy (around 50%) when I predict using the same data I used for
>>> training. I am probably doing the prediction the wrong way. My code is
>>> below. I would appreciate any help.
>>>
>>> import java.util.List;
>>>
>>> import org.apache.spark.SparkConf;
>>> import org.apache.spark.SparkContext;
>>> import org.apache.spark.api.java.JavaRDD;
>>> import org.apache.spark.api.java.function.Function;
>>> import org.apache.spark.api.java.function.Function2;
>>> import org.apache.spark.mllib.classification.SVMModel;
>>> import org.apache.spark.mllib.classification.SVMWithSGD;
>>> import org.apache.spark.mllib.regression.LabeledPoint;
>>> import org.apache.spark.mllib.util.MLUtils;
>>>
>>> import scala.Tuple2;
>>> import edu.illinois.biglbjava.readers.LabeledPointReader;
>>>
>>> public class SimpleDistSVM {
>>>   public static void main(String[] args) {
>>>     SparkConf conf = new SparkConf().setAppName("SVM Classifier Example");
>>>     SparkContext sc = new SparkContext(conf);
>>>     String inputPath=args[0];
>>>
>>>     // Read training data
>>>     JavaRDD<LabeledPoint> data = MLUtils.loadLibSVMFile(sc, inputPath).toJavaRDD();
>>>
>>>     // Run training algorithm to build the model.
>>>     int numIterations = 3;
>>>     final SVMModel model = SVMWithSGD.train(data.rdd(), numIterations);
>>>
>>>     // Clear the default threshold.
>>>     model.clearThreshold();
>>>
>>>     // Predict points in test set and map to an RDD of 0/1 values where
>>>     // 0 is misclassification and 1 is correct classification
>>>     JavaRDD<Integer> classification = data.map(new Function<LabeledPoint, Integer>() {
>>>       public Integer call(LabeledPoint p) {
>>>         int label = (int) p.label();
>>>         Double score = model.predict(p.features());
>>>         if((score >=0 && label == 1) || (score <0 && label == 0))
>>>         {
>>>           return 1; //correct classification
>>>         }
>>>         else
>>>           return 0;
>>>       }
>>>     }
>>>     );
>>>
>>>     // sum up all values in the rdd to get the number of correctly
>>>     // classified examples
>>>     int sum=classification.reduce(new Function2<Integer, Integer, Integer>()
>>>     {
>>>       public Integer call(Integer arg0, Integer arg1)
>>>           throws Exception {
>>>         return arg0+arg1;
>>>       }});
>>>
>>>     // compute accuracy as the fraction of correctly classified examples
>>>     double accuracy=((double)sum)/((double)classification.count());
>>>     System.out.println("Accuracy = " + accuracy);
>>>   }
>>> }
>>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>
> --
> Best Regards
>
> Jeff Zhang
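
For anyone following the thread, here is a minimal sketch of the comparison being discussed, written in plain Java with no Spark dependency so it can be run on its own. It assumes that the value returned by predict() after clearThreshold() is the raw margin (w^T x plus intercept), that labels are 0/1 as in LabeledPoint, and that a point counts as positive when its score is strictly above a cut-off of 0.0. The class name ScoreAccuracySketch, the helpers classify and accuracy, and the sample scores are all made up for illustration; this is not MLlib's code.

```java
public class ScoreAccuracySketch {

    // Hypothetical helper: turn a raw score into a 0/1 class using a cut-off.
    // Assumption: "positive" means the score is strictly above the threshold.
    public static int classify(double score, double threshold) {
        return score > threshold ? 1 : 0;
    }

    // Fraction of examples whose thresholded score matches the 0/1 label.
    public static double accuracy(double[] scores, int[] labels, double threshold) {
        if (scores.length != labels.length) {
            throw new IllegalArgumentException("scores and labels must have the same length");
        }
        if (scores.length == 0) {
            throw new IllegalArgumentException("need at least one example");
        }
        int correct = 0;
        for (int i = 0; i < scores.length; i++) {
            if (classify(scores[i], threshold) == labels[i]) {
                correct++;
            }
        }
        return (double) correct / scores.length;
    }

    public static void main(String[] args) {
        // Made-up scores and labels, for illustration only.
        double[] scores = {1.3, -0.4, 0.2, -2.1};
        int[] labels = {1, 0, 0, 0};
        // Three of the four examples match, so this prints "Accuracy = 0.75".
        System.out.println("Accuracy = " + accuracy(scores, labels, 0.0));
    }
}
```

If the scores really were confined to the range 0 to 1, as suspected above, a cut-off of 0.0 would label nearly every point positive, so it may be worth printing a few raw scores before settling on the threshold.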