Is your dataset balanced (roughly as many positive as negative samples)? Kernel SVMs as implemented in scikit-learn do not scale well with the number of samples: the computational cost is more than quadratic with respect to n_samples. Either subsample (especially if you have a large imbalance), use an approximation such as the Nystroem [1] feature expansion followed by a linear model, or use a more scalable non-linear algorithm such as RandomForestClassifier.
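
For the Nystroem route, here is a minimal sketch rather than a tuned solution. The data is a synthetic stand-in generated with make_classification, and the gamma, n_components and class_weight values are placeholder assumptions that you would need to adjust (for instance by grid search) on your own data:

```python
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for the real dataset (assumption).
X, y = make_classification(
    n_samples=2000, n_features=20, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Approximate an RBF kernel feature map with Nystroem, then fit a linear
# model on the expanded features. Hyperparameters are placeholders.
model = make_pipeline(
    StandardScaler(),
    Nystroem(kernel="rbf", gamma=0.05, n_components=200, random_state=0),
    SGDClassifier(loss="hinge", class_weight="balanced", random_state=0),
)
model.fit(X_train, y_train)

score = model.score(X_test, y_test)
print("held-out accuracy: %.3f" % score)
```

The cost of fitting the linear model grows roughly linearly with n_samples for a fixed n_components, so n_components is the main knob for trading accuracy against speed. With imbalanced classes, plain accuracy can be misleading, so you may prefer a metric such as ROC AUC or F1.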
[1] http://scikit-learn.org/stable/modules/kernel_approximation.html

--
Olivier
