date:20130731

Re: [Scikit-learn-general] problem with parallel computing on windows xp

2013-07-31 Thread Gael Varoquaux

On Thu, Aug 01, 2013 at 11:46:09AM +0800, Shuo Wang wrote:
> ImportError: [joblib] Attempting to do parallel computing without protecting
> your import on a system that does not support forking. To use
> parallel-computing in a script, you must protect you main loop using "if
> __name__ == '__main__
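The guard that the error message asks for can be sketched as follows. This is a minimal illustration that uses the standard-library multiprocessing module rather than joblib or scikit-learn; the names square and main are hypothetical and do not come from the thread. With scikit-learn, the code that builds and fits an estimator with n_jobs > 1 would presumably go inside the guarded block in the same way.

```python
import multiprocessing


def square(x):
    """Toy task executed in a worker process."""
    return x * x


def main():
    # Creating the pool starts worker processes. On platforms without
    # fork(), each worker re-imports this module, so this call must not
    # run at import time.
    with multiprocessing.Pool(processes=2) as pool:
        return pool.map(square, range(5))


if __name__ == "__main__":
    # Only the process that runs the script enters this branch; workers
    # that re-import the module skip it, so no processes are started
    # recursively.
    print(main())
```

Keeping all process-starting code behind the guard means the module can be imported safely, which is what a platform without forking needs.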

Re: [Scikit-learn-general] Scikit-learn alpha release

2013-07-31 Thread Gael Varoquaux

Hey Chris, This is good news. The problems are fairly minor. Don't worry about the issue. These tests failing are numerically unstable ones. We'll see what we can do about them, but they are not release blockers. The good news is that we don't have major building or linking problem. Thanks a lot!

Re: [Scikit-learn-general] Identical scores across repetitions of repeated CV ?? (figure included)

2013-07-31 Thread Joel Nothman

I think all those results correspond to the RBF kernel. You have far too few samples to learn an RBF model, so it's stored trivial coefficients independent of C and gamma.

On Thu, Aug 1, 2013 at 1:56 PM, Josh Wasserstein wrote:
> Hi,
>
> I am noticing that for some models in my grid search I get

[Scikit-learn-general] Identical scores across repetitions of repeated CV ?? (figure included)

2013-07-31 Thread Josh Wasserstein

Hi, I am noticing that for some models in my grid search I get virtually the same exact results across 100 repetitions of CV. Is this normal? In case it matters, I am working with ~30 data points (I know, it's a small dataset) with ~5 dimensions. Below are the details of the configuration that I

[Scikit-learn-general] problem with parallel computing on windows xp

2013-07-31 Thread Shuo Wang

Hi, I am trying to run 4 jobs on Windows XP with sklearn 0.13.1:

model = RandomForestRegressor(n_estimators=500, compute_importances=True, n_jobs=4)

I am receiving the following error:

Traceback (most recent call last):
  File "", line 1, in
  File "C:\Python27\lib\multiprocessing\forking.py",

[Scikit-learn-general] Scikit-learn alpha release

2013-07-31 Thread Gael Varoquaux

As Christoph, I am contacting you because you are the guy that rocks and provides fantastically useful binaries of many scientific-computing packages under Windows. We (the scikit-learn team) are going to release a new version of scikit-learn. I have tagged the alpha release and uploaded the sourc

Re: [Scikit-learn-general] GridSearchCV with multi-label: ROC-AUC-equivalent metrics

2013-07-31 Thread Arnaud Joly

It's what they have done in the mulan library.

Arnaud

On 19 Jul 2013, at 13:24, Olivier Grisel wrote:
> 2013/7/19 Arnaud Joly :
>> You can probably average the precision recall curve
>> or use some ranking metrics [1].
>>
>> Arnaud
>>
>> [1] Mining Multi-label Data
>> http://lkm.fri.uni-lj.s

Re: [Scikit-learn-general] random forest string data

2013-07-31 Thread Oğuz Yarımtepe

{"word": vocabulary[word], ...} the trained data is like [[0.0, 1.0, 'xxx', 'yyy', '13.0', ...], ] so when I use DictVectorizer it will create an array when I run fit_transform, something like array([[ 1., 0.], [ 0., 1.]]), with a different shape and data. I am not sure how I will repla

Re: [Scikit-learn-general] random forest string data

2013-07-31 Thread Lars Buitinck

2013/7/31 Oğuz Yarımtepe :
> How will I use DictVectorizer for string values above?

It won't do categorical integer coding directly. You can keep a separate dict of the string values, say vocabulary, then feed DictVectorizer dicts of the form

    {"word": vocabulary[word], ...}

--
Lars Buitinck
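The separate vocabulary dict suggested above could be built roughly like this. It is a plain-Python sketch under stated assumptions: the rows, the "count" key, and the name coded_rows are hypothetical, and the DictVectorizer step itself is left out so that the example has no dependencies.

```python
# Hypothetical rows with one string-valued column; not data from the thread.
rows = [
    {"word": "xxx", "count": 1.0},
    {"word": "yyy", "count": 0.0},
    {"word": "xxx", "count": 13.0},
]

# Separate dict mapping each distinct string to an integer code,
# assigned in order of first appearance.
vocabulary = {}
for row in rows:
    vocabulary.setdefault(row["word"], len(vocabulary))

# Replace each string by its integer code. Dicts with numeric values like
# these could then be passed to DictVectorizer, which keeps numeric values
# as feature values instead of one-hot coding them.
coded_rows = [
    {"word": vocabulary[row["word"]], "count": row["count"]} for row in rows
]

print(vocabulary)
print(coded_rows)
```

The same vocabulary has to be reused when coding new data at prediction time, otherwise the integer codes will not match those seen during training.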

Re: [Scikit-learn-general] random forest string data

2013-07-31 Thread Oğuz Yarımtepe

On Mon, Jul 29, 2013 at 12:19 AM, Ross Boucher wrote:
> Interesting, I've been using DictVectorizer (and one hot coded categorical
> data) with Random Forests and getting decent results. Is this just
> coincidental, and will I see better results if I combine the categorical
> data into a single c

Re: [Scikit-learn-general] random forest string data

2013-07-31 Thread Oğuz Yarımtepe

Hi, > What you get from DictVectorizer is a sparse matrix containing one-hot > coded categorical values (booleans). Random forests don't support > those, but fortunately they (should) handle categorical values without > one-hot coding, so you do something like > > I tried with string values and

Re: [Scikit-learn-general] cross-validation and indices=False

2013-07-31 Thread Jaques Grobler

Makes sense to me to deprecate here +1 2013/7/31 Olivier Grisel > +1 for deprecating boolean mask for CV as well.

Re: [Scikit-learn-general] cross-validation and indices=False

2013-07-31 Thread Olivier Grisel

+1 for deprecating boolean mask for CV as well.

Re: [Scikit-learn-general] cross-validation and indices=False

2013-07-31 Thread Gael Varoquaux

On Wed, Jul 31, 2013 at 09:14:15AM +1000, Joel Nothman wrote:
> What is the intention behind indices=False;

Old design oversight (aka historical reasons).

> why not deprecate it and simplify the API and code? (And speed up
> indexing by using np.take.)

+1! Making things simpler is always better.

Re: [Scikit-learn-general] cross-validation and indices=False

2013-07-31 Thread Alexandre Gramfort

hi, indeed we could stick to indices and use np.take whenever possible.

In [33]: A = np.random.randn(500, 500)
In [34]: idx = np.unique(np.random.randint(0, 499, 400))
In [35]: mask = np.zeros(500, dtype=np.bool)
In [36]: mask[idx] = True
In [37]: %timeit A[idx]
1000 loops, best of 3: 1.79 ms per
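For reference, the three ways of selecting rows discussed in this thread can be compared for correctness with a short script. This is a sketch assuming NumPy is installed; the seeded RandomState and the names rows_by_index, rows_by_mask and rows_by_take are additions for illustration, and no timings are claimed here.

```python
import numpy as np

rng = np.random.RandomState(0)
A = rng.randn(500, 500)

# Sorted, unique row indices, and the equivalent boolean mask.
idx = np.unique(rng.randint(0, 499, 400))
mask = np.zeros(500, dtype=bool)
mask[idx] = True

rows_by_index = A[idx]                  # integer (fancy) indexing
rows_by_mask = A[mask]                  # boolean mask indexing
rows_by_take = np.take(A, idx, axis=0)  # np.take along the first axis

# All three select the same rows in the same order, because idx is sorted
# and contains no duplicates.
assert np.array_equal(rows_by_index, rows_by_mask)
assert np.array_equal(rows_by_index, rows_by_take)
```

Relative speed depends on the array shape and the NumPy version, so it is worth measuring with timeit on the data at hand.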


15 matches
