[Scikit-learn-general] Bootstrap aggregating

2013-06-21 Thread Maheshakya Wijewardena
Hi all, I would like to know whether we have bootstrap aggregating functionality in the scikit-learn library. If so, how do I use it? (If it doesn't exist, I would like to implement it explicitly so that it coheres with the learning algorithms we have in scikit-learn.) Thank you

Re: [Scikit-learn-general] Bootstrap aggregating

2013-06-21 Thread Gilles Louppe
Hi, Such ensembles are not implemented at the moment. Gilles On 21 June 2013 09:59, Maheshakya Wijewardena wrote: > Hi all, > I would like to know whether we have bootstrap aggregating functionality in > scikit-learn library. If so, How do I use that? > (If it doesn't exist I would like to imp

Re: [Scikit-learn-general] Bootstrap aggregating

2013-06-21 Thread Maheshakya Wijewardena
I'm doing brownfield development for a university project and I'm very interested in this field. If I start implementing that kind of ensemble method, will it fit within the scope of the scikit-learn project? Will it be useful for the users? (I've felt the need for it personally. It has improved the re

Re: [Scikit-learn-general] Using Random forest classifier after One hot encoding

2013-06-21 Thread Maheshakya Wijewardena
Can anyone give me a sample algorithm for one-hot encoding used in scikit-learn? On Thu, Jun 20, 2013 at 8:37 PM, Peter Prettenhofer < peter.prettenho...@gmail.com> wrote: > you can try an ordinal encoding instead - just map each categorical value > to an integer so that you end up with 8 numeri
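For reference, a minimal sketch of the ordinal encoding Peter describes, using numpy only; the string categories below are purely hypothetical:

import numpy as np

# Hypothetical categorical column
colors = np.array(['red', 'green', 'blue', 'red', 'yellow', 'blue'])

# Map each distinct category to an integer (ordinal encoding)
categories, encoded = np.unique(colors, return_inverse=True)
print(categories)  # the distinct category labels
print(encoded)     # the integer code assigned to each sample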

Re: [Scikit-learn-general] Using Random forest classifier after One hot encoding

2013-06-21 Thread Peter Prettenhofer
? you already use one-hot encoding in your example ( preprocessing.OneHotEncoder) 2013/6/21 Maheshakya Wijewardena > can anyone give me a sample algorithm for one hot encoding used in > scikit-learn? > > > On Thu, Jun 20, 2013 at 8:37 PM, Peter Prettenhofer < > peter.prettenho...@gmail.com> wro
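A minimal usage sketch of preprocessing.OneHotEncoder, assuming the categorical features have already been coded as non-negative integers (as the encoder expected at the time); the toy data is made up:

import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Two integer-coded categorical features (toy data)
X = np.array([[0, 1],
              [1, 2],
              [2, 0]])

enc = OneHotEncoder()            # returns a scipy sparse matrix by default
X_onehot = enc.fit_transform(X)
print(X_onehot.toarray())        # densify only for inspection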

Re: [Scikit-learn-general] Using Random forest classifier after One hot encoding

2013-06-21 Thread Maheshakya Wijewardena
I'd like to analyse it a bit and encode using that method so that it works well with random forests in scikit-learn. On Fri, Jun 21, 2013 at 2:08 PM, Peter Prettenhofer < peter.prettenho...@gmail.com> wrote: > ? you already use one-hot encoding in your example ( > preprocessing.OneHotEncoder) > > > 2013/6/21 M

Re: [Scikit-learn-general] Bootstrap aggregating

2013-06-21 Thread Olivier Grisel
2013/6/21 Gilles Louppe : > Hi, > > Such ensembles are not implemented at the moment. Ensembles of trees have a `bootstrap` parameter that does bagging, although they also randomize the feature selection and, optionally, the split locations. -- Olivier ---
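A small sketch of what Olivier describes: the forest estimators already draw bootstrap samples when bootstrap=True (the default for RandomForestClassifier), on top of their per-split feature subsampling. The data here is synthetic:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Each tree is fit on a bootstrap sample of the training set
clf = RandomForestClassifier(n_estimators=50, bootstrap=True, random_state=0)
clf.fit(X, y)
print(clf.score(X, y))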

Re: [Scikit-learn-general] Bootstrap aggregating

2013-06-21 Thread Maheshakya Wijewardena
So that means bagging can currently only be applied to trees. How about implementing a general module so that it can be applied to more learning algorithms? On Fri, Jun 21, 2013 at 4:17 PM, Olivier Grisel wrote: > 2013/6/21 Gilles Louppe : > > Hi, > > > > Such ensembles are not implemented at the mom

Re: [Scikit-learn-general] Using Random forest classifier after One hot encoding

2013-06-21 Thread federico vaggi
What do you mean? It's pretty trivial to implement a one-hot encoding; the issue is that if you use a non-sparse format, you'll end up with a matrix which is far too dense to be practical for anything but trivial examples. On Fri, Jun 21, 2013 at 10:46 AM, Maheshakya Wijewardena < pmaheshak
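To illustrate the point about density, here is a hedged sketch showing that OneHotEncoder keeps its result as a scipy sparse matrix, and that it is the dense copy which becomes impractical; the sizes below are arbitrary:

import numpy as np
from sklearn.preprocessing import OneHotEncoder

# One categorical feature with 1000 distinct values (hypothetical)
rng = np.random.RandomState(0)
X = rng.randint(0, 1000, size=(10000, 1))

X_onehot = OneHotEncoder().fit_transform(X)    # scipy sparse matrix
print(X_onehot.shape, X_onehot.nnz)            # only 10000 stored non-zeros
print(X_onehot.toarray().nbytes)               # dense copy: ~80 MB of mostly zeros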

Re: [Scikit-learn-general] Shade/tint a segment

2013-06-21 Thread Andreas Mueller
Hi Michael. I think you wanted the scikit-image mailing list; this is scikit-learn. You can overlay a tinted segment using imshow and the alpha parameter. That's how I usually do it. Cheers, Andy On 06/21/2013 12:21 AM, Brickle Macho wrote: > I over segment an image using a superpixel algorithm.
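A minimal matplotlib sketch of the overlay Andy describes, assuming a grayscale image and a boolean mask for the segment (both made up here):

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical grayscale image and segment mask
image = np.random.rand(100, 100)
mask = np.zeros((100, 100), dtype=bool)
mask[30:70, 30:70] = True

plt.imshow(image, cmap='gray')
# Tint only the segment: masked entries are drawn as transparent
overlay = np.ma.masked_where(~mask, np.ones_like(image))
plt.imshow(overlay, cmap='autumn', alpha=0.4)
plt.show()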

Re: [Scikit-learn-general] Bootstrap aggregating

2013-06-21 Thread Andreas Mueller
On 06/21/2013 12:56 PM, Maheshakya Wijewardena wrote: > So that means that bagging can only be applied to trees. How about > implementing a general module so that it can be applied on more > learning algorithms. > I think that would be great. You should look at the forest implementation to get st
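As a rough illustration of what such a general module could look like (not an existing scikit-learn API; the class name SimpleBagging and the majority-vote details are made up here, and integer class labels are assumed):

import numpy as np
from sklearn.base import clone

class SimpleBagging(object):
    """Toy bagging wrapper: fit clones of a base estimator on bootstrap samples."""

    def __init__(self, base_estimator, n_estimators=10, random_state=0):
        self.base_estimator = base_estimator
        self.n_estimators = n_estimators
        self.random_state = random_state

    def fit(self, X, y):
        rng = np.random.RandomState(self.random_state)
        n_samples = X.shape[0]
        self.estimators_ = []
        for _ in range(self.n_estimators):
            # Draw a bootstrap sample (sampling with replacement)
            indices = rng.randint(0, n_samples, n_samples)
            est = clone(self.base_estimator)
            est.fit(X[indices], y[indices])
            self.estimators_.append(est)
        return self

    def predict(self, X):
        # Majority vote over the individual predictions (labels assumed to be 0..k-1)
        all_preds = np.asarray([est.predict(X) for est in self.estimators_]).astype(int)
        return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, all_preds)

Something like SimpleBagging(DecisionTreeClassifier(), n_estimators=25).fit(X, y) would then bag an arbitrary classifier, which is roughly the direction the forest code already points in.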

Re: [Scikit-learn-general] Bootstrap aggregating

2013-06-21 Thread Maheshakya Wijewardena
Thank you. I'll have a look at the forest implementation, check what can be done, and let you know. I'd also like to have a look at Gilles's code. If it's convenient, can you tell me how you tried to implement it? Best, Maheshakya On Fri, Jun 21, 2013 at 6:55 PM, Andreas Mueller wrote: > On 06/21/2013

Re: [Scikit-learn-general] Bootstrap aggregating

2013-06-21 Thread Andreas Mueller
On 06/21/2013 03:37 PM, Maheshakya Wijewardena wrote: > Thank you. I'll have look at the forest implementation and check what > can be done and inform you. > I'd like to have a look at Gilles s code. If it's convenient, can you > tell how you tried to implement that? I think it was mostly removin

Re: [Scikit-learn-general] Bootstrap aggregating

2013-06-21 Thread Maheshakya Wijewardena
Ok, I got it. I'll look at the code and see what can be done. Thank you. On Fri, Jun 21, 2013 at 7:17 PM, Andreas Mueller wrote: > On 06/21/2013 03:37 PM, Maheshakya Wijewardena wrote: > > Thank you. I'll have look at the forest implementation and check what > > can be done and inform you. > >

Re: [Scikit-learn-general] SVM: select the training set randomly

2013-06-21 Thread Gianni Iannelli
Thank you very much for the link!! It does pretty much what I want to do! In my case I have two classes, for example 0 and 1. I want to keep the class distribution between them similar in the training set (and therefore also in the test set). I also need the samples to be chosen randomly; I don't care if in one case

Re: [Scikit-learn-general] SVM: select the training set randomly

2013-06-21 Thread Roban Kramer
StratifiedKFold will keep the class distribution the same for you: http://scikit-learn.org/stable/modules/generated/sklearn.cross_validation.StratifiedKFold.html#sklearn.cross_validation.StratifiedKFold There are lots of metrics (score functions, etc.) available: http://scikit-learn.org/stable/m
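A minimal sketch of the StratifiedKFold usage from that link, assuming the sklearn.cross_validation API of that era; the labels below are hypothetical, with a 1:3 imbalance:

import numpy as np
from sklearn.cross_validation import StratifiedKFold

# Hypothetical labels with a 1:3 class ratio
y = np.array([0] * 25 + [1] * 75)
X = np.random.rand(100, 5)

skf = StratifiedKFold(y, n_folds=4)
for train_index, test_index in skf:
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    # Each fold preserves the roughly 1:3 class ratio
    print(np.bincount(y_train), np.bincount(y_test))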

Re: [Scikit-learn-general] SVM: select the training set randomly

2013-06-21 Thread Gianni Iannelli
StratifiedKFold will keep the class distribution the same for you: http://scikit-learn.org/stable/modules/generated/sklearn.cross_validation.StratifiedKFold.html#sklearn.cross_validation.StratifiedKFold I was looking at this; it says: This cross-validation object is a variation of KFold, whi

Re: [Scikit-learn-general] SVM: select the training set randomly

2013-06-21 Thread Roban Kramer
Oh sorry, I was thinking of balanced sets for cross validation, rather than a training and testing split. I don't know of a convenience routine specifically for producing stratified training and testing sets. If both your classes have decent support and the training and testing set sizes aren't too
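For what it's worth, a hand-rolled stratified split is only a few lines of numpy: shuffle and split each class separately, then recombine. A hedged sketch (all names and sizes below are illustrative):

import numpy as np

def stratified_split(y, test_fraction=0.25, seed=0):
    """Return train/test indices that preserve the class ratio of y."""
    rng = np.random.RandomState(seed)
    train_idx, test_idx = [], []
    for label in np.unique(y):
        idx = np.where(y == label)[0]
        rng.shuffle(idx)
        n_test = int(round(test_fraction * len(idx)))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return np.array(train_idx), np.array(test_idx)

# Hypothetical labels with a 1:3 class ratio
y = np.array([0] * 25 + [1] * 75)
train_idx, test_idx = stratified_split(y)
print(np.bincount(y[train_idx]), np.bincount(y[test_idx]))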

Re: [Scikit-learn-general] SVM: select the training set randomly

2013-06-21 Thread Gianni Iannelli
Ah, ok! Yeah, I was thinking that having a 50/50 (or even 40/60) split between the two classes in my dataset would not be a problem, but since the ratio is 1/3 I would prefer to keep the same distribution for both, hence my choice of the train_test_split method. I don't know if there are

Re: [Scikit-learn-general] SVM: select the training set randomly

2013-06-21 Thread Gianni Iannelli
Found the error... I post it below. The problem is that metrics.confusion_matrix accepts lists and not a numpy.array, so I converted everything to lists:
# Compute the confusion matrix
y_testlist_tmp = y_test.transpose().tolist()
y_testlist = y_testlist_tmp[0]
resultlist = result.tolist()
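For what it's worth, metrics.confusion_matrix does accept numpy arrays as long as they are one-dimensional; the trouble here is most likely that y_test was a 2D column vector, and flattening it with ravel() works just as well as converting to lists. A small sketch with made-up data:

import numpy as np
from sklearn.metrics import confusion_matrix

y_test = np.array([[0], [1], [1], [0]])   # 2D column vector (shape (4, 1))
result = np.array([0, 1, 0, 0])

# Flatten the column vector to 1D before computing the confusion matrix
print(confusion_matrix(y_test.ravel(), result))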

[Scikit-learn-general] Optimization of the SVM parameters

2013-06-21 Thread Gianni Iannelli
Dear All, I'm stuck with a problem and I don't know if it's a bug. I'm defining the optimization parameters C and gamma for my SVM in this way:
C = 10.0 ** numpy.arange(-3, 9)
gamma = 10.0 ** numpy.arange(-6, 4)
param_grid = dict(gamma=gamma, C=C)
svr = svm.SVC(kernel='rbf')
clfopt = grid_search.Grid
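The snippet is cut off, but a complete hedged sketch of such a grid search (using the grid_search module named in the message, and the iris data purely as a stand-in) might look like:

import numpy
from sklearn import svm, grid_search, datasets

# Stand-in data just to make the example runnable
iris = datasets.load_iris()
X, y = iris.data, iris.target

C = 10.0 ** numpy.arange(-3, 9)
gamma = 10.0 ** numpy.arange(-6, 4)
param_grid = dict(gamma=gamma, C=C)

svr = svm.SVC(kernel='rbf')
clfopt = grid_search.GridSearchCV(svr, param_grid=param_grid, cv=3)
clfopt.fit(X, y)
print(clfopt.best_params_)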