2012/10/30 Afik Cohen <[email protected]>:
>  Now, however, we've run into a problem when we tried to upgrade to
>  scikit-learn 0.13. It appears there have been significant changes to the
>  underlying LIBLINEAR library as well as changes to the svm/classes
>  interfaces; a recent commit shows almost 4000 lines being removed from
>  linear.cpp:
>  https://github.com/larsmans/scikit-learn/commit/706319655a1380a154da92d5dd83128faf532881
>
>  Unfortunately, it appears our patch to the LIBLINEAR library to support
>  prediction probabilities for LinearSVC is now incompatible. Could someone
>  shed some light on the reasoning behind this change to the core library and
>  help us adapt our patch to the current state? We use LinearSVC because it
>  trains the fastest and gives the most accurate results, and even gives us
>  prediction probabilities after applying this patch. We'd like to continue
>  doing so with current and future versions of scikit!

You're misreading the commit. This patch doesn't change a single line
of code in Liblinear, and the few commits after it change very little;
it just replaces 4000 lines of Cython-generated C code from *our
wrapper code* with a few lines of equivalent Python code. The reasons
for this change are safety, memory efficiency and maintainability.

As for the fix, you can just copy over the predict_proba from
linear_model.LogisticRegression and you'll get the same result. Or
just paste this simpler version of the method into LinearSVC:

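# Needs "import numpy as np" at the top of the module.
def predict_proba(self, X):
    # Squash the raw decision scores through a logistic sigmoid.
    scores = self.decision_function(X)
    prob = 1. / (1. + np.exp(-scores))
    if prob.ndim == 1:
        # Binary case: one score per sample; columns are the
        # negative and positive class.
        return np.vstack([1 - prob, prob]).T
    else:
        # Multiclass case: one score per class; rescale rows to sum to 1.
        return prob / prob.sum(axis=1).reshape((prob.shape[0], -1))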

As Andreas wrote, that doesn't make much sense mathematically and only
produces a kind of counterfeit probability. In the multiclass case,
it's even worse and the probabilities tend to be very different from
what a true multiclass LR would produce, so use this at your own risk.
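
To see what this method does numerically without training anything,
here is a minimal standalone sketch. The helper name
sigmoid_pseudo_proba and the score values are made up for
illustration; they are not part of scikit-learn and not output from a
real model. The softmax at the end is only there for comparison: it
shows that the two normalizations disagree even on identical scores,
and a real multinomial LR would also learn different weights.

```python
import numpy as np


def sigmoid_pseudo_proba(scores):
    """Hypothetical helper: same arithmetic as the method above,
    applied directly to an array of decision scores."""
    scores = np.asarray(scores, dtype=float)
    prob = 1.0 / (1.0 + np.exp(-scores))
    if prob.ndim == 1:
        # Binary: one score per sample, columns are [negative, positive].
        return np.vstack([1 - prob, prob]).T
    # Multiclass one-vs-rest: one score per class, rescale each row.
    return prob / prob.sum(axis=1, keepdims=True)


# Made-up decision scores (assumption, not from a fitted LinearSVC).
binary_scores = np.array([-2.0, 0.0, 3.0])
multi_scores = np.array([[2.0, -1.0, -1.5],
                         [0.1, 0.0, -0.1]])

binary_proba = sigmoid_pseudo_proba(binary_scores)
multi_proba = sigmoid_pseudo_proba(multi_scores)

# Softmax over the same multiclass scores, for comparison only.
shifted = multi_scores - multi_scores.max(axis=1, keepdims=True)
softmax = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

print(binary_proba)
print(multi_proba)
print(softmax)
```

Every row sums to one in both cases, which is all the rescaling
guarantees. The ranking of classes is preserved, but the values
themselves are not calibrated.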

-- 
Lars Buitinck
Scientific programmer, ILPS
University of Amsterdam

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
