[Scikit-learn-general] Different results in R and sklearn

Timothy Vivian-Griffiths Tue, 20 Jan 2015 08:30:37 -0800

Hi Andy,

Firstly, you're right: the dimensions that I gave were wrong. The input shape 
is correct, but the target vector has shape (7763,), so there are 7763 samples 
with 125 features (that is in the smaller dataset I am using; the other has 
over 30,000 features, but I haven't tried that one in R yet).
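
For reference, the shape convention discussed here can be sketched as a small
check. This is only an illustration: check_shapes is a hypothetical helper
name, not part of scikit-learn, and it works on plain shape tuples such as
those reported by an array's .shape attribute.

```python
def check_shapes(x_shape, y_shape):
    """Return (n_samples, n_features) if the two shapes agree, else raise.

    x_shape is expected to be (n_samples, n_features) and
    y_shape is expected to be (n_samples,).
    """
    if len(x_shape) != 2:
        raise ValueError("inputs must be 2-D: (n_samples, n_features)")
    n_samples, n_features = x_shape
    if tuple(y_shape) != (n_samples,):
        raise ValueError(
            "target has shape %r, expected (%d,)" % (tuple(y_shape), n_samples)
        )
    return n_samples, n_features


# Shapes from this message: 7763 samples with 125 features each.
print(check_shapes((7763, 125), (7763,)))
```

Passing the shapes from the earlier message, (7763, 125) and (125,), would
raise a ValueError, which is the mismatch that was pointed out.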

I am using the svm model from the e1071 library in R. The documentation states 
that this uses libsvm as well, and I have tried to set as many parameters as 
possible to be the same (those that I can remember are kernel, cost, gamma, 
cache_size, shrinking and tolerance). Come to think of it, I haven't tried 
setting the random_state for either of them, so I'll give that a go when I can, 
but I don't know if that will be the same across software anyway.
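
A minimal sketch of how the parameter names listed above might be lined up
between the two libraries. The helper e1071_to_svc_kwargs and the
KERNEL_NAMES table are hypothetical names introduced for illustration. The
default values are assumptions based on the e1071 svm() documentation
(cost = 1, gamma = 1 / number of features, cachesize = 40,
tolerance = 0.001, shrinking = TRUE) and should be checked against the
installed version.

```python
# Assumed mapping from e1071 kernel names to sklearn.svm.SVC kernel names.
KERNEL_NAMES = {
    "linear": "linear",
    "polynomial": "poly",
    "radial": "rbf",
    "sigmoid": "sigmoid",
}


def e1071_to_svc_kwargs(n_features, kernel="radial", cost=1.0, gamma=None,
                        cachesize=40, shrinking=True, tolerance=0.001):
    """Translate e1071-style svm() arguments into SVC keyword arguments."""
    if gamma is None:
        # Assumed e1071 default: gamma = 1 / ncol(x). Setting it explicitly
        # avoids relying on whatever default the sklearn version uses.
        gamma = 1.0 / n_features
    return {
        "kernel": KERNEL_NAMES[kernel],
        "C": cost,
        "gamma": gamma,
        "cache_size": cachesize,
        "shrinking": shrinking,
        "tol": tolerance,
    }


# The smaller dataset in this thread has 125 features.
svc_kwargs = e1071_to_svc_kwargs(n_features=125)
print(svc_kwargs)
```

The resulting dictionary could then be passed on as SVC(**svc_kwargs). One
setting that is not in the list above: as far as the e1071 documentation
goes, svm() standardizes the inputs by default (scale = TRUE), whereas SVC
does not scale anything itself, so a matching preprocessing step (for
example sklearn.preprocessing.StandardScaler) may be needed before the two
fits are comparable.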

But I am definitely loading in the same data, and the R version is not 
predicting only 0s (for C=1, kernel='rbf'). I will also compare the performance 
with some other kernels and parameters when I can.

Tim

> 
> On 01/19/2015 10:43 AM, Timothy Vivian-Griffiths wrote:
>> I have used this same dataset and parameters in R's implementation of an SVM, 
>> and it is not outputting all 0s, so I don't think that it's a particular 
>> problem with the data.
> This seems odd. What implementation are you using in R?
> Scikit-learn uses libsvm, which is more or less the reference 
> implementation for kernel SVMs.
> Maybe the R package you are using parametrizes the SVM in a different way.
> 
> Btw, you said:
> 
> for interest the inputs matrix had shape (7763, 125) and the target 
> vector (125,):
> 
> That can not be. The input needs to be (n_samples, n_features) and the 
> target (n_samples,)
> Do you only have 125 samples and 7763 features?
> That is very few samples for an RBF-SVM ....
> 

