Re: [Scikit-learn-general] Scikit-learn-general Digest, Vol 57, Issue 5

2014-10-06 Thread Gael Varoquaux
On Mon, Oct 06, 2014 at 05:35:12PM -0400, Alan G Isaac wrote: > On 10/6/2014 Gael Varoquaux wrote: > > Parallel computing has problems with lambda functions. > Can you elaborate on that please? Lambdas don't pickle. Parallel computing needs pickling (or it needs to do hacks). Gaël -
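Gaël's point can be checked directly with the standard library. The sketch below is not from the thread; `can_pickle` and `square` are illustrative names. It relies on the fact that `pickle` serializes a function by reference to an importable name, which an anonymous lambda does not have.

```python
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def square(x):
    # A named, module-level function: pickle stores only its module and name.
    return x * x


# A lambda's name is "<lambda>", which cannot be looked up again, so pickling fails.
print("lambda picklable:", can_pickle(lambda x: x * x))
# Built-in functions are looked up by name and pickle without trouble.
print("len picklable:", can_pickle(len))
# A named function is expected to pickle when this file is run as a script
# or imported as a module.
print("square picklable:", can_pickle(square))
```

Replacing the lambda with a function defined at module level is therefore usually enough to make it usable with process-based parallelism.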

Re: [Scikit-learn-general] Scikit-learn-general Digest, Vol 57, Issue 5

2014-10-06 Thread Alan G Isaac
On 10/6/2014 Gael Varoquaux wrote: > Parallel computing has problems with lambda functions. Can you elaborate on that please? I'm aware of the currying problem: http://stackoverflow.com/questions/11371009/parallel-mapping-functions-in-ipython-w-multiple-parameters But that is not a problem with la

Re: [Scikit-learn-general] Scikit-learn-general Digest, Vol 57, Issue 5

2014-10-06 Thread Manoj Kumar
This is because multiprocessing in Python cannot handle functions that cannot be pickled. I've learnt this the hard way. You can maybe have a look at this, http://matthewrocklin.com/blog/work/2013/12/05/Parallelism-and-Serialization/ On Mon, Oct 6, 2014 at 10:48 PM, Gael Varoquaux < gael.varoqu.
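A common reason for reaching for a lambda is to bind extra arguments before mapping a function over data. As a hedged sketch of one alternative, using only the standard library: a `functools.partial` object pickles as long as the function it wraps does. The choice of `operator.mul` and the name `triple` are illustrative, not taken from the thread.

```python
import functools
import operator
import pickle

# Bind the first argument of a picklable function instead of writing
# `lambda x: 3 * x`, which pickle cannot serialize.
triple = functools.partial(operator.mul, 3)

# Round-trip through pickle, as multiprocessing does when it ships a
# task to a worker process.
restored = pickle.loads(pickle.dumps(triple))
print(restored(4))  # prints 12
```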

Re: [Scikit-learn-general] Scikit-learn-general Digest, Vol 57, Issue 5

2014-10-06 Thread Gael Varoquaux
On Mon, Oct 06, 2014 at 07:50:42PM +0100, Dominic Steinitz wrote: > > Try not using a lambda function, but a fully-fledged function. Parallel > > computing has problems with lambda functions. > Do you mean parallel computing generally or in Python? In Python. Gaël --

Re: [Scikit-learn-general] gridSearchCV parallel processing

2014-10-06 Thread Gael Varoquaux
On Mon, Oct 06, 2014 at 06:27:21PM +, Pagliari, Roberto wrote: > I’d like to use the multiprocessing module to run different tasks at the same time > (each of which may run grid search). > Are there any known issues when using this module with GridSearchCV (and n_jobs > >1 ), or anything I should

Re: [Scikit-learn-general] DecisionTreeClassifier.tree_.value mapping to class

2014-10-06 Thread Gael Varoquaux
On Mon, Oct 06, 2014 at 04:56:22PM -0300, Nicolas Emiliani wrote: > This will print out: >   >     >>> array([[  0.,   1.,  68.]]) > but ... how do I know which position in that array belongs to which class ? > The classifier has a classes_ attribute which is also a list >     >>> clf.classes_ >
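In scikit-learn, the columns of each `tree_.value` row follow the order of `clf.classes_`, so the two can be paired position by position. A minimal sketch in plain Python, using the counts quoted above; the class labels are hypothetical, since the thread does not show them.

```python
# Counts for one leaf node, as printed in the question (one output, three classes).
node_value = [[0.0, 1.0, 68.0]]

# Hypothetical stand-in for clf.classes_; the real labels are not given in the thread.
classes = ["class_a", "class_b", "class_c"]

# Position i in the value row corresponds to classes[i].
counts_by_class = dict(zip(classes, node_value[0]))
print(counts_by_class)

# The majority class of the leaf is the label with the largest count.
majority = max(counts_by_class, key=counts_by_class.get)
print(majority)
```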

[Scikit-learn-general] DecisionTreeClassifier.tree_.value mapping to class

2014-10-06 Thread Nicolas Emiliani
Hi! I am using a scikit-learn DecisionTreeClassifier on a 3-class dataset. After I fit the classifier I access all leaf nodes on the tree_ attribute in order to get the number of instances that end up in a given node for each class. clf = tree.DecisionTreeClassifier(max_depth=5) clf.fit(

Re: [Scikit-learn-general] Scikit-learn-general Digest, Vol 57, Issue 5

2014-10-06 Thread Dominic Steinitz
Dominic Steinitz domi...@steinitz.org http://idontgetoutmuch.wordpress.com > > Message: 2 > Date: Mon, 6 Oct 2014 15:39:43 +0200 > From: Gael Varoquaux > Subject: Re: [Scikit-learn-general] cross_val_score crashes python > every time > To: scikit-learn-general@lists.sourceforge.net > Mess

Re: [Scikit-learn-general] error when using linear SVM with AdaBoost

2014-10-06 Thread Pagliari, Roberto
Hi Mathieu, Which dataset are you referring to? Thanks From: Mathieu Blondel [mailto:math...@mblondel.org] Sent: Saturday, October 04, 2014 10:13 AM To: scikit-learn-general Subject: Re: [Scikit-learn-general] error when using linear SVM with AdaBoost On Sat, Oct 4, 2014 at 1:09 AM, Andy ma

[Scikit-learn-general] gridSearchCV parallel processing

2014-10-06 Thread Pagliari, Roberto
I'd like to use the multiprocessing module to run different tasks at the same time (each of which may run grid search). Are there any known issues when using this module with GridSearchCV (and n_jobs > 1), or anything I should consider when doing this? Thank you,

Re: [Scikit-learn-general] Better results with R than with Scikit

2014-10-06 Thread Andy
Hi Zoraida. I am not an expert in R GLMs but I think the glm call just does logistic regression. For the binary case, this is the same as sklearn.linear_model.LogisticRegression. Just a wild guess: Did you use clf.decision_function results as input to roc_auc_score? If you use clf.predict results
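The distinction raised here matters because ROC AUC is a ranking metric: feeding it hard 0/1 predictions discards the ordering within each predicted class and usually lowers the score. A small pure-Python sketch with made-up labels and scores (`pairwise_auc` is an illustrative helper, not a scikit-learn function):

```python
def pairwise_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example: continuous scores, as a decision function would return.
y_true = [0, 0, 1, 1]
scores = [-2.0, 0.5, 0.7, 1.5]

# Thresholding at zero yields hard labels, as a predict-style call would.
hard = [1 if s > 0 else 0 for s in scores]

print(pairwise_auc(y_true, scores))  # 1.0
print(pairwise_auc(y_true, hard))    # 0.75
```

The same model thus appears worse when evaluated on its hard labels, which is one possible source of a gap between two toolkits evaluated differently.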

Re: [Scikit-learn-general] Better results with R than with Scikit

2014-10-06 Thread Peter Prettenhofer
thanks Olivier -- much appreciated - this will de-militarize my conversations in English a lot. 2014-10-06 16:36 GMT+02:00 Olivier Grisel : > 2014-10-06 15:27 GMT+02:00 Peter Prettenhofer < > peter.prettenho...@gmail.com>: > > > > Both scikit-learn and R (glmnet) should be thoroughly documented.

Re: [Scikit-learn-general] Better results with R than with Scikit

2014-10-06 Thread Olivier Grisel
2014-10-06 15:27 GMT+02:00 Peter Prettenhofer : > > Both scikit-learn and R (glmnet) should be thoroughly documented. ML tools > have come a long way and are very robust and usable these days but they are > not completely fire-and-forget**. > > ** sorry for the military term but I lack a good alter

Re: [Scikit-learn-general] cross_val_score crashes python every time

2014-10-06 Thread Gael Varoquaux
> I'm trying to use cross_val_score inside a lambda function to take full > advantage of my processors - Try not using a lambda function, but a fully-fledged function. Parallel computing has problems with lambda functions. Gaël

Re: [Scikit-learn-general] Better results with R than with Scikit

2014-10-06 Thread Peter Prettenhofer
Hi Zoraida, can you provide a code snippet (e.g. upload it to gist.github.com) that illustrates the problem -- especially how you evaluate the goodness of the predictions (both R and scikit-learn)? It's pretty difficult to argue about the issue without seeing what you actually do. The difference be

[Scikit-learn-general] Better results with R than with Scikit

2014-10-06 Thread ZORAIDA HIDALGO SANCHEZ
Hi all, I know the subject is ugly but I don't really know what to call it. I am a newbie with all these machine learning techniques and what I do most of the time is to follow a "trial and error" approach. I know this method has some drawbacks but for now it is what I am able to do. I am working with

Re: [Scikit-learn-general] Access to path coefficients in LassoCV

2014-10-06 Thread Olivier Grisel
You can use the `lasso_path` function to get the full path: http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.lasso_path.html#sklearn.linear_model.lasso_path Here is an example: http://scikit-learn.org/stable/auto_examples/linear_model/plot_lasso_coordinate_descent_path

Re: [Scikit-learn-general] Access to path coefficients in LassoCV

2014-10-06 Thread Fabien
Hi Michael, thanks for your answer, it helped a lot already. On 06.10.2014 11:06, Michael Eickenberg wrote: > may prefer using e.g. LassoLarsCV. I gave it a shot and indeed the coefs are available, and it worked really fine on my well designed test problems. However, with my real use case it p

Re: [Scikit-learn-general] Access to path coefficients in LassoCV

2014-10-06 Thread Michael Eickenberg
Hi Fabien, welcome to the list! If you are interested in the exact locations of the kinks in the coefficient path, you may prefer using e.g. LassoLarsCV. It works on your size of problem (iff "highly collinear" doesn't mean "basically equal") and has the attribute "coef_path_" ( https://github.c

Re: [Scikit-learn-general] cross_val_score crashes python every time

2014-10-06 Thread Olivier Grisel
There might be a problem with running multiprocessing (that is used internally by cross_val_score with n_jobs=-1) in concurrent Python threads. BTW, why do you use threads in the first place? -- Olivier

[Scikit-learn-general] Access to path coefficients in LassoCV

2014-10-06 Thread Fabien
Folks, this is my first message on this newsgroup, so first: Hi! I have two questions, I hope they are not too trivial: 1. Access to coefficients in LassoCV I use LassoCV to find the optimal alpha for my problem. For analysis purposes I'd like to get access to the path coefficients, more or le