Hello,

This is my first post to the list. I have recently been in touch with
Alexandre Gramfort, and I would be very interested in exploring some
outlier/anomaly detection algorithms, then implementing them behind a
scikit-learn compatible API (with a view to eventually merging them).

I am not especially familiar with the state of the art regarding the
efficiency of such algorithms. I have only read some surveys and other
literature on the topic, and my conclusion is that exploring the following
classical methods would be productive:


- density-based algorithms: LOF (Local Outlier Factor) and its variations
(other algorithms using relative density/k-NN), such as COF
(Connectivity-based Outlier Factor), ODIN (Outlier Detection using Indegree
Number), and LOCI (Local Correlation Integral).

LOF: http://www.dbs.ifi.lmu.de/Publikationen/Papers/LOF.pdf

COF: http://www.cse.cuhk.edu.hk/~adafu/Pub/pakdd02.pdf

ODIN: ftp://193.167.42.127/pub/franti/papers/Hautamaki/P2.pdf

LOCI: http://www.dtic.mil/dtic/tr/fulltext/u2/a461085.pdf
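
To make the "relative density/k-NN" idea concrete, here is a rough
brute-force sketch of the LOF score in plain Python. It is only an
illustration of the definition as I understand it from the paper, not a
proposed implementation: the function name lof_scores and the toy data are
made up for this example, ties in the k-th neighbor distance are ignored,
and it assumes there are no duplicate points (which would give a zero
reachability distance).

```python
import math


def lof_scores(points, k=3):
    """Brute-force LOF for a small list of points (illustrative sketch only)."""
    n = len(points)
    # Full pairwise Euclidean distance matrix, O(n^2) on purpose for clarity.
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbors = []   # indices of the k nearest neighbors of each point
    k_distance = []  # distance from each point to its k-th nearest neighbor
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbors.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])

    def reach_dist(i, j):
        # Reachability distance of point i with respect to point j.
        return max(k_distance[j], dist[i][j])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = [k / sum(reach_dist(i, j) for j in neighbors[i]) for i in range(n)]

    # LOF: mean ratio between the neighbors' density and the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]


# Toy data (hypothetical): a 3x3 grid plus one isolated point.
points = [(x, y) for x in range(3) for y in range(3)] + [(10, 10)]
scores = lof_scores(points, k=3)
```

Scores close to 1 mean a point is about as dense as its neighbors; the
isolated point gets a much larger score than the grid points.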


- high-dimensional approach: the "Aggarwal and Yu algorithm"

http://www.researchgate.net/publication/2401320_Outlier_Detection_for_High_Dimensional_Data/file/e0b49525c3e5f60b5e.pdf


- iForest (Isolation Forest), which seems very interesting because it does
not rely on any distance or density measure.

http://cs.nju.edu.cn/zhouzh/zhouzh.files/publication/icdm08b.pdf?q=isolation
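
As a rough illustration of why no distance or density measure is needed,
the sketch below counts how many random axis-aligned splits it takes to
isolate a point within a random subsample. This is a simplified stand-in
for the method in the paper, not the paper's exact algorithm: it does not
build explicit trees and omits the path-length normalization. The function
names, parameters, and toy data are hypothetical, chosen for this example.

```python
import random


def isolation_depth(point, sample, rng, max_depth=50):
    """Count random splits until `point` is alone in its partition of `sample`."""
    depth = 0
    current = sample
    while len(current) > 1 and depth < max_depth:
        dim = rng.randrange(len(point))          # pick a random feature
        lo = min(row[dim] for row in current)
        hi = max(row[dim] for row in current)
        if lo == hi:
            break                                # cannot split on this feature
        split = rng.uniform(lo, hi)              # pick a random split value
        # Keep only the side of the split that contains `point`.
        if point[dim] < split:
            current = [row for row in current if row[dim] < split]
        else:
            current = [row for row in current if row[dim] >= split]
        depth += 1
    return depth


def mean_isolation_depth(point, data, n_trees=200, sample_size=64, seed=0):
    """Average isolation depth over several random subsamples of `data`."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trees):
        sample = rng.sample(data, min(sample_size, len(data)))
        total += isolation_depth(point, sample, rng)
    return total / n_trees


# Toy data (hypothetical): a Gaussian cluster plus one far-away point.
data_rng = random.Random(42)
data = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(256)]
outlier = (8.0, 8.0)
data.append(outlier)

outlier_depth = mean_isolation_depth(outlier, data)
inlier_depth = mean_isolation_depth((0.0, 0.0), data)
```

Points that are easy to isolate (short average depth) are the anomaly
candidates; only comparisons against split values are used, never a
distance between points.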


Please let me know if any of these algorithms (or others) would be of
particular interest.

In any case, I'd be very glad to get any feedback on this.


Cheers,

Nicolas
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general