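Did you change the labels only once, with a test set of just 4 samples? I would
imagine that is where the above-chance accuracy comes from: with so few test
samples, a single random label assignment can easily land well above (or
below) 50%.
If you repeat over many different label assignments, the average will come
out at 50/50.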
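
To illustrate the point about repeating over label assignments, here is a
minimal, dependency-free sketch. It is not the code from this thread: the
features are pure Gaussian noise, a simple nearest-centroid classifier stands
in for the LinearSVC, and all names (nearest_centroid_accuracy,
stratified_split, cv_accuracy) as well as the feature count and repeat counts
are assumptions chosen only for illustration. It keeps the 8/8 class balance
and the 4-sample test set, and compares the score of single label assignments
with the average over many of them.

```python
import random
import statistics


def nearest_centroid_accuracy(X, y, train_idx, test_idx):
    """Fit a nearest-centroid classifier on the training rows and
    return its accuracy on the test rows."""
    n_features = len(X[0])
    centroids = {}
    for label in (0, 1):
        rows = [X[i] for i in train_idx if y[i] == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows)
                            for j in range(n_features)]
    correct = 0
    for i in test_idx:
        # squared Euclidean distance to each class centroid
        dist = {label: sum((a - b) ** 2 for a, b in zip(X[i], c))
                for label, c in centroids.items()}
        pred = min(dist, key=dist.get)
        correct += (pred == y[i])
    return correct / len(test_idx)


def stratified_split(y, n_test_per_class, rng):
    """Random split that holds out n_test_per_class samples of each class."""
    train, test = [], []
    for label in (0, 1):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        test += idx[:n_test_per_class]
        train += idx[n_test_per_class:]
    return train, test


def cv_accuracy(X, y, n_splits, rng):
    """Mean accuracy over n_splits stratified random splits (test size 4)."""
    scores = []
    for _ in range(n_splits):
        train, test = stratified_split(y, 2, rng)
        scores.append(nearest_centroid_accuracy(X, y, train, test))
    return statistics.mean(scores)


rng = random.Random(0)
n_samples, n_features = 16, 50  # feature count is an arbitrary assumption
# Pure-noise features: there is nothing to learn from them.
X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]

per_assignment = []
for _ in range(200):
    y = [0] * 8 + [1] * 8
    rng.shuffle(y)  # a fresh random label assignment, still 8 vs 8
    per_assignment.append(cv_accuracy(X, y, 20, rng))

overall = statistics.mean(per_assignment)
spread = statistics.stdev(per_assignment)
print("mean accuracy over label assignments: %.3f" % overall)
print("std across single assignments:        %.3f" % spread)
print("min / max single assignment:          %.3f / %.3f"
      % (min(per_assignment), max(per_assignment)))
```

Individual label assignments can score noticeably above or below chance, which
is what a handful of manual relabelings would show; only the average over many
assignments settles near 50%.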

On 04/27/2015 11:33 AM, Fabrizio Fasano wrote:
> Dear Andy,
>
> Yes, the classes have the same size, 8 and 8
>
> This is one example of the code I used to cross-validate the classification
> (here I used StratifiedShuffleSplit, but I also tried other methods such as
> leave-one-out or simple 4-fold cross-validation, and the results didn't
> change much):
>
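> import numpy as np
> from sklearn import svm
> from sklearn.cross_validation import StratifiedShuffleSplit
>
> # X_scaled (scaled feature matrix) and y (0/1 labels) are assumed to be
> # defined already.
> # 100 stratified random splits, each holding out 25% (4 of the 16 samples)
> sss = StratifiedShuffleSplit(y, 100, test_size=0.25, random_state=0)
> clf = svm.LinearSVC(penalty="l1", dual=False, C=1, random_state=1)
>
> cv_scores = []
> for train_index, test_index in sss:
>     X_train, X_test = X_scaled[train_index], X_scaled[test_index]
>     y_train, y_test = y[train_index], y[test_index]
>     clf.fit(X_train, y_train)
>     y_pred = clf.predict(X_test)
>     # fraction of correctly predicted test labels in this split
>     cv_scores.append(np.mean(y_pred == y_test))
>
> # mean accuracy in percent, +/- two standard deviations
> print("Accuracy %.0f +/- %.0f" % (np.ceil(100 * np.mean(cv_scores)),
>                                   np.ceil(200 * np.std(cv_scores))))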
>
>
>
>
> On Apr 26, 2015, at 7:50 PM, Andy wrote:
>
>> Your expectation is right: if you randomly assign labels, you shouldn't
>> get more than 50% correct with a large enough dataset.
>> I imagine there is some issue in how you shuffled the labels. Without
>> the code, it is hard to tell.
>> Are you sure the classes have the same size?
>>
>> On 04/26/2015 11:22 AM, Fabrizio Fasano wrote:
>>> Dear Andreas,
>>>
>>> Thanks a lot for your help,
>>>
>>> About the random assignment of values to my labels y: what I mean is that,
>>> being suspicious of the too-good performance, I changed the labels
>>> manually, keeping the 50/50 split of 1s and 0s but in different orders, and
>>> the labels were always predicted very well, with accuracy no lower than 60%.
>>> By chance, I expected values lower than 50% as well as values higher than
>>> 50%. I didn't perform an exhaustive test (I only did it manually for a few
>>> combinations)...
>>>
>>> Fabrizio
>>> ------------------------------------------------------------------------------
>>> One dashboard for servers and applications across Physical-Virtual-Cloud
>>> Widest out-of-the-box monitoring support with 50+ applications
>>> Performance metrics, stats and reports that give you Actionable Insights
>>> Deep dive visibility with transaction tracing using APM Insight.
>>> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
>>> _______________________________________________
>>> Scikit-learn-general mailing list
>>> Scikit-learn-general@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>


