@ChungHung Liu
The SMOTE algorithm works by forming convex combinations (points on the
connecting line segment) of a minority-class sample and its nearest
neighbors.
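So, for instance, you choose N. Then choose an arbitrary member of the
minority class, say A.
Look at A's N nearest neighbors. If you search among all points, only
M <= N of them will also be members of the minority class; the original
SMOTE paper searches among minority samples only, so every neighbor is a
minority member.
Let B be a minority member of Neighborhood(A). SMOTE will form the new
example A + t * (B - A), with t drawn at random from [0, 1], i.e. a random
point on the segment between A and B.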
The number of generated examples of course depends on how much you wish
to upsample.
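
The steps above can be sketched in a few lines of plain Python. This is
only an illustration, not the reference implementation: the function name
smote_sketch and its parameters are made up for this example, neighbors
are searched among minority samples only, and A is drawn at random rather
than looping over every minority sample.

```python
import math
import random


def smote_sketch(minority, n_neighbors, n_synthetic, seed=0):
    """Return n_synthetic points interpolated between minority samples.

    minority is a list of equal-length numeric tuples (hypothetical input
    format chosen for this sketch).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        # Pick an arbitrary minority sample A.
        i = rng.randrange(len(minority))
        a = minority[i]
        # Rank the other minority samples by Euclidean distance to A.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(a, p))
        # Pick B among A's n_neighbors nearest minority neighbors.
        b = rng.choice(others[:n_neighbors])
        # New point = A + gap * (B - A), with gap drawn from [0, 1).
        gap = rng.random()
        synthetic.append(tuple(x + gap * (y - x) for x, y in zip(a, b)))
    return synthetic
```

Here n_synthetic plays the role of "the amount you wish to upsample":
asking for as many synthetic points as there are minority samples roughly
doubles the minority class.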
The above scheme generally fails when the minority class doesn't form
clusters.
SMOTE upsamples in order to "carve out" a region of the feature space for
the minority class, so if minority members are isolated and surrounded by
majority members, it won't perform very well.
You should also look up "Tomek links" (pairs of points from opposite
classes that are each other's nearest neighbor) and see how removing them
after performing SMOTE might help your performance.
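
To make the Tomek link idea concrete, here is a rough brute-force sketch
that finds the links in a small labeled data set. The function name
tomek_links and the input format are assumptions made for this example;
a real data set would need a proper nearest-neighbor index instead of the
O(n^2) search.

```python
import math


def tomek_links(points, labels):
    """Return index pairs (i, j), i < j, that form Tomek links.

    A pair is a Tomek link when the two points have different labels and
    each one is the other's nearest neighbor.
    """
    def nearest(i):
        # Index of the point closest to points[i], excluding i itself.
        return min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )

    nn = [nearest(i) for i in range(len(points))]
    links = []
    for i, j in enumerate(nn):
        # Mutual nearest neighbors from opposite classes.
        if i < j and nn[j] == i and labels[i] != labels[j]:
            links.append((i, j))
    return links
```

Whether you then drop both points of each link or only the majority
member is a design choice; it is worth trying both.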
--
https://github.com/bearrito
@deepbearrito
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general