Hi,

Yes, you can use your labeled data to learn your hyper-parameters through cross-validation (CV). You will need to sub-sample your normal class so that the proportions of normal and abnormal samples are similar.
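As a rough illustration of that suggestion, here is a minimal sketch of tuning a one-class model with labeled data. Everything specific in it is an assumption made for the example: synthetic Gaussian data stands in for the accelerometer features, the 3:1 sub-sampling ratio, the grid over `nu` and `gamma`, the choice of ROC AUC as the score, and the helper name `cv_auc` are illustrative only.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import ParameterGrid, StratifiedKFold
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)

# Synthetic stand-in for the windowed features (assumption, not real data):
# roughly 5 abnormal windows for every 300 normal ones.
X_normal = rng.normal(0.0, 1.0, size=(1200, 4))
X_abnormal = rng.normal(4.0, 1.0, size=(20, 4))

# Sub-sample the normal class so the validation folds are not dominated by it.
# The 3:1 ratio is an arbitrary illustrative choice.
keep = rng.choice(len(X_normal), size=3 * len(X_abnormal), replace=False)
X = np.vstack([X_normal[keep], X_abnormal])
y = np.r_[np.zeros(len(keep), dtype=int), np.ones(len(X_abnormal), dtype=int)]


def cv_auc(params, X, y, n_splits=5, seed=0):
    """Mean ROC AUC over folds (hypothetical helper).

    The model is fit on the normal samples of each training fold only;
    the labels are used solely to score the held-out fold.
    """
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in skf.split(X, y):
        X_tr, y_tr = X[train_idx], y[train_idx]
        clf = OneClassSVM(kernel="rbf", **params).fit(X_tr[y_tr == 0])
        # decision_function is larger for inliers, so negate it to get
        # an "abnormality" score for the positive (abnormal) class.
        abnormality = -clf.decision_function(X[test_idx])
        scores.append(roc_auc_score(y[test_idx], abnormality))
    return float(np.mean(scores))


param_grid = ParameterGrid({"nu": [0.01, 0.05, 0.1], "gamma": [0.01, 0.1, 1.0]})
results = [(cv_auc(p, X, y), p) for p in param_grid]
best_score, best_params = max(results, key=lambda r: r[0])

# Refit on all available normal samples with the selected settings.
final_clf = OneClassSVM(kernel="rbf", **best_params).fit(X_normal)
print(best_params, round(best_score, 3))
```

Refitting on the full normal class at the end is a design choice of this sketch; the sub-sample is only used to pick the hyper-parameters. The same loop should work with `EllipticEnvelope` or `IsolationForest` by swapping the estimator and its parameter grid.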
You can also try supervised classification algorithms on "not too highly unbalanced" sub-samples.

Nicolas

On Thu, Aug 4, 2016 at 5:17 PM, Amita Misra <amis...@ucsc.edu> wrote:

> Hi,
>
> I am currently exploring the problem of speed bump detection using
> accelerometer time series data.
> I have extracted some features based on mean, std deviation etc. within a
> time window.
>
> Since the dataset is highly skewed (I have just 5 positive samples for
> every 300 samples), I was looking into
>
> OneClassSVM
> covariance.EllipticEnvelope
> sklearn.ensemble.IsolationForest
>
> but I am not sure how to use them.
>
> What I get from the docs:
> separate the positive examples and train using only negative examples
>
> clf.fit(X_train)
>
> and then predict the positive examples using
>
> clf.predict(X_test)
>
> I am not sure what is then the role of positive examples in my training
> dataset or how I can use them to improve my classifier so that I can
> predict better on new samples.
>
> Can we do something like cross-validation to learn the parameters, as in
> normal binary SVM classification?
>
> Thanks,
> Amita
>
> Amita Misra
> Graduate Student Researcher
> Natural Language and Dialogue Systems Lab
> Baskin School of Engineering
> University of California Santa Cruz
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn