Hi Albert,

Thank you for replying.

You are right, a high FPR might indicate an overfitting problem. After
discussing this with friends, our conclusion so far is that I was worrying
about a non-existent problem. Feeding two data sets that both come from the
'normal' classes into the 'decision_function' of OCSVM and reading the
resulting AUROC does not give any information about the quality of the
anomaly detector. The reading is meaningful only when the scores come from
both the 'normal' class and the 'anomaly' class.

Again, thank you for your kind reply.

Best regards,
Ady

On 4/6/17, Albert Thomas <albertthoma...@gmail.com> wrote:
> Hi Ady,
>
> Overfitting is a possible explanation. If your model learnt your normal
> scenarios too well then every abnormal data point will be predicted as
> abnormal (so you will have good performance on anomalies); however, none
> of the normal instances of the test set will be in the normal region (so
> you will have a high FPR).
>
> Albert
>
> On Wed, 5 Apr 2017 at 15:37, Ady Wahyudi Paundu <awpau...@gmail.com> wrote:
>
>> Good day Scikit-Learn Masters,
>>
>> I have used Scikit-Learn's OCSVM module previously with satisfying
>> results.
>> However, on my current task I have this problem with one-class analysis:
>>
>> In my previous cases, I used OCSVM as an anomaly detector, and the
>> normal class in each case came from one scenario.
>> Now, I want to create one anomaly detector system with multiple
>> normal scenarios (in this case, 3 different normal scenarios). Let's say
>> I have scenarios A, B and C, and I want to distinguish all data that is
>> not coming from A, B or C.
>> What I have tried is combining all training data from A, B and C
>> into one data set and fitting it using the OCSVM module. When I tested
>> the resulting model on several anomaly data sets it worked well. However,
>> when I tested it against any one of the normal scenarios, it gave a very
>> high False Positives (AUROC: 99%).
>>
>> So my question: is it because of a bad approach, namely combining all
>> the different normal data sets into one training data set?
>> Or is it because I was using it (the OCSVM) wrong? (I use the 'rbf'
>> kernel with nu and gamma set to 0.001)
>> Or is it a case of the wrong tool? Another algorithm perhaps?
>>
>> I don't know if this is a proper question to ask here, so if it is not
>> (maybe because this is just a Machine Learning question in general),
>> just disregard it.
>>
>> Thank you in advance
>>
>> Best regards,
>> Ady
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn@python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
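
For anyone reading this thread later, here is a minimal sketch of the evaluation described in the reply at the top: the AUROC is computed from 'decision_function' scores of held-out normal data and anomaly data together, while data that is all normal is summarised by its false positive rate instead. Everything in it is an assumption made for illustration and not the setup from the thread: the three normal scenarios are synthetic 2-D Gaussian blobs, the anomalies are a fourth blob placed far away, and nu=0.05 and gamma=0.5 are arbitrary values chosen for this toy data (the thread used 0.001 for both).

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)


def sample(center, n):
    """Draw n synthetic 2-D points around `center` (illustrative data only)."""
    return rng.normal(loc=center, scale=0.5, size=(n, 2))


# Hypothetical stand-ins for normal scenarios A, B and C.
centers = [(0.0, 0.0), (5.0, 5.0), (-5.0, 5.0)]
X_train = np.vstack([sample(c, 200) for c in centers])
X_normal_test = np.vstack([sample(c, 100) for c in centers])

# Hypothetical anomalies, placed far from every normal scenario.
X_anomaly_test = sample((0.0, -8.0), 100)

# Fit on the combined normal data only; nu and gamma are assumed values.
model = OneClassSVM(kernel="rbf", nu=0.05, gamma=0.5).fit(X_train)

# AUROC needs both classes. Higher decision_function means "more normal",
# so normal points are labelled 1 and anomalies 0.
scores = model.decision_function(np.vstack([X_normal_test, X_anomaly_test]))
y_true = np.r_[np.ones(len(X_normal_test)), np.zeros(len(X_anomaly_test))]
auc_normal_vs_anomaly = roc_auc_score(y_true, scores)

# With normal data only, the useful number is the false positive rate:
# the fraction of held-out normal points that predict() flags as -1.
fpr_normal = float(np.mean(model.predict(X_normal_test) == -1))

print(f"AUROC (normal vs anomaly): {auc_normal_vs_anomaly:.3f}")
print(f"FPR on held-out normal data: {fpr_normal:.3f}")
```

An AUROC computed between two normal data sets would only say whether one set tends to score higher than the other, so it can take any value without saying anything about detection quality.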