On Sat, Oct 4, 2014 at 1:09 AM, Andy <t3k...@gmail.com> wrote:
>
> I'm pretty sure that is wrong, unless you use the "decision_function"
> and not "predict_proba" or "predict".
> Mathieu said "predict" is used. Then it is still like a (very old
> school) neural network with a thresholding layer,
> and not like a linear model at all.
>
I don't think this is exactly like a neural network. In a neural network,
the non-linear activation functions are part of the objective function, so
they affect parameter estimation directly. Here, a linear SVC is fitted
first, *then* its weight in the ensemble is estimated with the SVC's
predictions held fixed. Since np.sign (or predict_proba, when available) is
applied post hoc, it affects neither the linear SVC itself nor its weight
in the ensemble.
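To make the order of operations concrete, here is a minimal sketch of one
discrete-AdaBoost (SAMME-style) round with a linear SVC. This is an
illustration with assumed toy data, not scikit-learn's actual internals:
the weak learner is fitted first, and only afterwards is its ensemble
weight computed from its fixed, thresholded predictions.

```python
# Sketch of one discrete-AdaBoost round (assumption: toy data from
# make_classification, not the author's actual setup).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=200, random_state=0)
y_pm = 2 * y - 1                     # labels mapped to {-1, +1}
w = np.full(len(X), 1.0 / len(X))    # uniform sample weights

svc = LinearSVC().fit(X, y, sample_weight=w)  # step 1: fit the weak learner
pred = np.sign(svc.decision_function(X))      # step 2: threshold post hoc
err = w[pred != y_pm].sum()                   # weighted training error
alpha = 0.5 * np.log((1 - err + 1e-10) / (err + 1e-10))  # ensemble weight

# Neither `pred` nor `alpha` feeds back into the SVC's coefficients:
# the thresholding happens after the linear model is already fitted.
```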
The main idea of AdaBoost is to increasingly focus on the difficult
examples. This requires weak learners that are diverse enough, i.e., that
disagree in their predictions on many examples. My intuition is that a
linear SVC doesn't fulfill this requirement: I would rather use a weak
learner (oracle) with high variance and low bias.
I would be curious to see how AdaBoost + LinearSVC fares on MNIST. Since
non-linear models outperform linear ones on this dataset, the results would
be a good indicator of whether the ensemble gains any non-linearity.
Mathieu
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general