Thanks Joel. That's my intuition, but I don't quite get why that would
lead to identical scores. Every CV split should yield different training
and testing data, so even if the RBF model stores trivial coefficients (and I
still don't know what trivial would mean here), the support vectors would
be different, and the performance would vary for the same exact kernel,
wouldn't it?
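
A minimal sketch of one way identical scores can arise even though every
split is different. It assumes "trivial" means the fitted model ends up
predicting a single class for every test point. The labels below are
hypothetical (32 samples split 16 / 8 / 8, not the data from this thread),
the constant predictor is a stand-in for a degenerate SVC rather than an
actual SVC fit, and the splitter uses the current scikit-learn API
(n_splits) instead of the older n_iter form quoted below.

```python
import numpy as np
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedShuffleSplit

# Hypothetical labels: 32 samples in 3 classes (16 / 8 / 8), chosen so that
# a 25% stratified test split always holds exactly 4 / 2 / 2 samples.
y = np.array([0] * 16 + [1] * 8 + [2] * 8)
X = np.zeros((len(y), 5))  # features are irrelevant to a constant predictor

splitter = StratifiedShuffleSplit(n_splits=100, test_size=0.25, random_state=0)

scores = []
for train_idx, test_idx in splitter.split(X, y):
    # Stand-in for a degenerate model: always predict the majority class
    # of the training fold, regardless of the input.
    majority = np.bincount(y[train_idx]).argmax()
    y_pred = np.full(len(test_idx), majority)
    scores.append(
        f1_score(y[test_idx], y_pred, average="macro", zero_division=0)
    )

scores = np.array(scores)
print(np.unique(scores))  # a single distinct value across all 100 splits
```

Because the splits are stratified, every test fold has the same class
counts, so a model that ignores its input gets the same confusion matrix,
and therefore the same macro F1, on every split, no matter which samples
land in the fold.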

Josh


On Thu, Aug 1, 2013 at 12:26 AM, Joel Nothman
<[email protected]> wrote:

> I think all those results correspond to the RBF kernel. You have far too
> few samples to learn an RBF model, so it's stored trivial coefficients
> independent of C and gamma.
>
>
> On Thu, Aug 1, 2013 at 1:56 PM, Josh Wasserstein
> <[email protected]> wrote:
>
>> Hi,
>>
>> I am noticing that for some models in my grid search I get virtually the
>> same exact results across 100 repetitions of CV. Is this normal? In case it
>> matters, I am working with ~30 data points (I know, it's a small dataset)
>> with ~5 dimensions.
>>
>> Below are the details of the configuration that I used for grid search:
>>
>> with K=4:
>> sfs = StratifiedShuffleSplit(y,n_iter=100,test_size=1.0/K)
>>
>> I am working on a 3-label classification problem with the following SVM
>> kernels:
>>
>>   tuned_parameters = [
>>       {'kernel': ['linear'],
>>        'C': np.power(2, np.arange(-8., 8., step_size))},
>>       {'kernel': ['rbf'],
>>        'C': np.power(2, np.arange(-8., 8., step_size)),
>>        'gamma': np.power(2, np.arange(-8., 8., step_size))},
>>   ]
>>
>> #....#
>> clf = GridSearchCV(SVC(C=1, cache_size=5000),
>>  tuned_parameters,
>>  scoring=f1_macro,
>>  verbose=1, n_jobs=1, cv=sfs)
>> clf.fit(X,y)
>> #....#
>>
>> Below is a plot of the *cv_validation_scores* (mean, min, max,
>> mean-std and mean+std) from *clf.cv_scores_*.
>>
>> More specifically:
>>
>>   all_scores = [x.cv_validation_scores for x in clf.cv_scores_]
>>   all_scores = np.vstack(all_scores).transpose()
>>
>>   # Load the scores into a pandas DataFrame and sort the columns (the models)
>>   all_scores_df = pd.DataFrame(all_scores)
>>   sorted_columns = all_scores_df.mean().order(ascending=False).index
>>   sorted_scores = all_scores_df.reindex_axis(sorted_columns, axis=1)
>>
>>   # Plot envelope:
>>   max_values  = sorted_scores.max().values
>>   min_values  = sorted_scores.min().values
>>   mean_values = sorted_scores.mean().values
>>   std_values  = sorted_scores.std().values
>>
>>   fig = plt.figure()
>>   fig.hold(True)
>>   plt.plot(max_values, color='r')
>>   plt.plot(min_values, color='r')
>>   plt.plot(mean_values, color='b')
>>
>>   above = mean_values + std_values
>>   above = np.minimum(above,max_values)
>>   plt.plot(above, color='g', linestyle='--', linewidth=2.0)
>>   below = mean_values - std_values
>>   below = np.maximum(below,min_values)
>>   plt.plot(below, color='g', linestyle='--', linewidth=2.0)
>>
>> [image: Inline image 1]
>> And here is an example of one of those models:
>>
>> > clf.cv_scores_[8].cv_validation_scores
>>
>> array([ 0.21505376,  0.21505376,  0.21505376,  0.21505376,  0.21505376,
>>         0.21505376,  0.21505376,  0.21505376,  0.21505376,  0.21505376,
>>         0.21505376,  0.21505376,  0.21505376,  0.21505376,  0.21505376,
>>         0.21505376,  0.21505376,  0.21505376,  0.21505376,  0.21505376,
>>         0.21505376,  0.21505376,  0.21505376,  0.21505376,  0.21505376,
>>         0.21505376,  0.21505376,  0.21505376,  0.21505376,  0.21505376,
>>         0.21505376,  0.21505376,  0.21505376,  0.21505376,  0.21505376,
>>         0.21505376,  0.21505376,  0.21505376,  0.21505376,  0.21505376,
>>         0.21505376,  0.21505376,  0.21505376,  0.21505376,  0.21505376,
>>         0.21505376,  0.21505376,  0.21505376,  0.21505376,  0.21505376,
>>         0.21505376,  0.21505376,  0.21505376,  0.21505376,  0.21505376,
>>         0.21505376,  0.21505376,  0.21505376,  0.21505376,  0.21505376,
>>         0.21505376,  0.21505376,  0.21505376,  0.21505376,  0.21505376,
>>         0.21505376,  0.21505376,  0.21505376,  0.21505376,  0.21505376,
>>         0.21505376,  0.21505376,  0.21505376,  0.21505376,  0.21505376,
>>         0.21505376,  0.21505376,  0.21505376,  0.21505376,  0.21505376,
>>         0.21505376,  0.21505376,  0.21505376,  0.21505376,  0.21505376,
>>         0.21505376,  0.21505376,  0.21505376,  0.21505376,  0.21505376,
>>         0.21505376,  0.21505376,  0.21505376,  0.21505376,  0.21505376,
>>         0.21505376,  0.21505376,  0.21505376,  0.21505376,  0.21505376])
>>
>> Thanks,
>>
>> Josh
>>
>>
>> ------------------------------------------------------------------------------
>> Get your SQL database under version control now!
>> Version control is standard for application code, but databases havent
>> caught up. So what steps can you take to put your SQL databases under
>> version control? Why should you start doing it? Read more to find out.
>>
>> http://pubads.g.doubleclick.net/gampad/clk?id=49501711&iu=/4140/ostg.clktrk
>> _______________________________________________
>> Scikit-learn-general mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>>
>
>

<<f1_macro_envelope.jpg>>
