Re: [Scikit-learn-general] sklearn on CentOS

2014-09-25 Thread Olivier Grisel
Both pip and easy_install build numpy and SciPy from source under Linux.

Re: [Scikit-learn-general] Inputer, python list and strings

2014-09-25 Thread Vlad Niculae
Hi Zoraida, The Imputer assumes that your data is a numeric numpy array, or convertible to one. You should replace your string "NA" values with np.nan objects, then use the Imputer with the default, `missing_values='NaN'`. It's easier to debug if you explicitly convert your data to a float numpy
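As a rough illustration of the conversion suggested above, the sketch below parses pipe-delimited rows into a float numpy array, mapping the string "NA" to np.nan. The sample rows, the helper name `to_float_array`, and its parameters are hypothetical and not taken from the thread. The last two lines are a plain-numpy stand-in for column-mean imputation, not the Imputer itself; once the array is numeric, it could be passed to the Imputer with its default `missing_values='NaN'` instead.

```python
import numpy as np

# Hypothetical sample rows, shaped like the pipe-delimited data in the question.
raw_rows = [
    "1.0|NA|3.0",
    "4.0|5.0|NA",
    "7.0|8.0|9.0",
]


def to_float_array(rows, sep="|", missing_token="NA"):
    """Parse delimited text rows into a float array, mapping the token to NaN."""
    parsed = [
        [np.nan if field == missing_token else float(field)
         for field in row.split(sep)]
        for row in rows
    ]
    return np.array(parsed, dtype=float)


X = to_float_array(raw_rows)

# Plain-numpy stand-in for mean imputation along axis 0 (column means).
col_means = np.nanmean(X, axis=0)
X_filled = np.where(np.isnan(X), col_means, X)
```

Converting explicitly like this makes it easy to inspect `X.dtype` and the NaN positions before any pipeline step runs.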

Re: [Scikit-learn-general] sklearn on CentOS

2014-09-25 Thread Pagliari, Roberto
Thanks a lot for your help. I will try with Anaconda. If not with yum, I must have used easy_install or pip. I definitely did not build it from source. Thank you, From: Sergio Pascual [mailto:sergio.pa...@gmail.com] Sent: Thursday, September 25, 2014 12:33 PM To: scikit-learn-general@lists.sou

Re: [Scikit-learn-general] sklearn on CentOS

2014-09-25 Thread Sergio Pascual
2014-09-25 18:01 GMT+02:00 Pagliari, Roberto : > Here it is > > > > numpy-1.4.1-9.el6.x86_64 > > package scipy is not installed > > > strangely it is saying scipy is not installed, but I did install it and I > can import it in python.. > > > So you installed scipy, but not from an RPM/yum. Som

Re: [Scikit-learn-general] sklearn on CentOS

2014-09-25 Thread Andy
On 09/25/2014 06:01 PM, Pagliari, Roberto wrote: Here it is numpy-1.4.1-9.el6.x86_64 package scipy is not installed strangely it is saying scipy is not installed, but I did install it and I can import it in python.. That is why I asked how you installed scipy and to import it ;) You ca
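One way to see which copy of a package Python actually imports, independent of what rpm reports, is to print each module's version and file path. The sketch below is only an illustration of that idea; the helper name `describe_module` is hypothetical, and the broad `except` is an assumption made so that a failing import (such as the RuntimeError reported elsewhere in this thread) is shown as `None` rather than stopping the script.

```python
import importlib


def describe_module(name):
    """Return (version, file path) for a module, or None if importing it fails."""
    try:
        module = importlib.import_module(name)
    except Exception:
        # Covers ImportError as well as errors raised while the module loads.
        return None
    version = getattr(module, "__version__", "unknown")
    location = getattr(module, "__file__", "built-in")
    return version, location


for name in ("numpy", "scipy", "sklearn"):
    print(name, describe_module(name))
```

A path under a system site-packages directory versus a user or virtual-environment directory usually hints at whether the package came from the distribution or from pip/easy_install.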

Re: [Scikit-learn-general] sklearn on CentOS

2014-09-25 Thread Pagliari, Roberto
Here it is numpy-1.4.1-9.el6.x86_64 package scipy is not installed strangely it is saying scipy is not installed, but I did install it and I can import it in python.. From: Sergio Pascual [mailto:sergio.pa...@gmail.com] Sent: Thursday, September 25, 2014 11:46 AM To: scikit-learn-general@li

Re: [Scikit-learn-general] feature_importances_ from gridsearchCV

2014-09-25 Thread Pagliari, Roberto
I did not know I could do clf.best_estimator_.feature_importances_ if it works, that's what I was looking for.. :) thank you, -Original Message- From: Andy [mailto:t3k...@gmail.com] Sent: Thursday, September 25, 2014 11:38 AM To: scikit-learn-general@lists.sourceforge.net Subject: Re

Re: [Scikit-learn-general] sklearn on CentOS

2014-09-25 Thread Sergio Pascual
2014-09-25 17:17 GMT+02:00 Pagliari, Roberto : > Sorry, I've got 0.14 and in fact the following warning > > UserWarning: Numpy 1.5.1 or above is recommended for this version of scipy > (detected version 1.4.1) > > The current version of scipy in CentOS 6 is 0.7.2. Could you run the following? $ rpm -q numpy $

[Scikit-learn-general] Inputer, python list and strings

2014-09-25 Thread ZORAIDA HIDALGO SANCHEZ
Hi all, I am having problems when trying to deal with missing values. I am using Imputer like this: Pipeline([('imputerNA', Imputer(missing_values='NA', strategy='mean', axis=0, verbose=4)), ('minmax', MinMaxScaler())]))] My data looks like this: 24881956.0|NA|1840.0|NA|NA|48.0|1.4|NA|-1.0|0.0|

Re: [Scikit-learn-general] sklearn on CentOS

2014-09-25 Thread Pagliari, Roberto
No, That was the warning I got when importing scipy. If I try to import sklearn I get this /usr/lib64/python2.6/site-packages/scipy/__init__.py:120: UserWarning: Numpy 1.5.1 or above is recommended for this version of scipy (detected version 1.4.1) UserWarning) RuntimeError: module compiled a

Re: [Scikit-learn-general] feature_importances_ from gridsearchCV

2014-09-25 Thread Andy
On 09/25/2014 05:30 PM, Pagliari, Roberto wrote: > I just printed both best_estimator and best_parameters, but I'm not getting > the feature importance.. Can you elaborate? As Gael said, you are looking for best_estimator_.feature_importances_ What do you mean by not getting the feature importan

Re: [Scikit-learn-general] sklearn on CentOS

2014-09-25 Thread Andy
On 09/25/2014 05:17 PM, Pagliari, Roberto wrote: > Sorry, I've got 0.14 and in fact the following warning That's why I asked. If these are the packages provided by yum, that is quite odd. Was the incompatible numpy error you got below from scipy or scikit-learn? ---

Re: [Scikit-learn-general] feature_importances_ from gridsearchCV

2014-09-25 Thread Pagliari, Roberto
I just printed both best_estimator and best_parameters, but I'm not getting the feature importance.. -Original Message- From: Gael Varoquaux [mailto:gael.varoqu...@normalesup.org] Sent: Thursday, September 25, 2014 11:25 AM To: scikit-learn-general@lists.sourceforge.net Subject: Re: [Sci

Re: [Scikit-learn-general] feature_importances_ from gridsearchCV

2014-09-25 Thread Gael Varoquaux
> What would be the difference between best_estimator and best_params? best_params are the parameters for best_estimator

Re: [Scikit-learn-general] feature_importances_ from gridsearchCV

2014-09-25 Thread Pagliari, Roberto
Thank you. What would be the difference between best_estimator and best_params? Thanks, -Original Message- From: Gael Varoquaux [mailto:gael.varoqu...@normalesup.org] Sent: Thursday, September 25, 2014 11:09 AM To: scikit-learn-general@lists.sourceforge.net Subject: Re: [Scikit-learn-

Re: [Scikit-learn-general] sklearn on CentOS

2014-09-25 Thread Pagliari, Roberto
Sorry, I've got 0.14 and in fact the following warning UserWarning: Numpy 1.5.1 or above is recommended for this version of scipy (detected version 1.4.1) -Original Message- From: Andy [mailto:t3k...@gmail.com] Sent: Thursday, September 25, 2014 11:07 AM To: scikit-learn-general@lists.s

Re: [Scikit-learn-general] feature_importances_ from gridsearchCV

2014-09-25 Thread Gael Varoquaux
On Thu, Sep 25, 2014 at 03:06:15PM +, Pagliari, Roberto wrote: > the object clf will not have feature_importances_. Is that embedded in > best_estimator? Yes: best_estimator_.feature_importances_ G
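A minimal sketch of the answer above, assuming a synthetic dataset and an illustrative parameter grid (neither appears in the thread): after fitting, the grid-search object exposes the refitted model as `best_estimator_` and the winning settings as `best_params_`, and the importances are read from the former. The import fallback is there because the thread predates the `sklearn.model_selection` module.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

try:
    from sklearn.model_selection import GridSearchCV  # newer releases
except ImportError:
    from sklearn.grid_search import GridSearchCV  # module name used in the thread

# Synthetic data, only for illustration.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

# Illustrative grid; the thread's own `parameters` dict was left empty.
parameters = {"n_estimators": [10, 20], "max_depth": [2, 4]}
clf = GridSearchCV(RandomForestClassifier(random_state=0),
                   param_grid=parameters, cv=3)
clf.fit(X, y)

# best_params_ holds the parameter values that best_estimator_ was refit with.
print(clf.best_params_)

# The importances live on the refitted forest, not on the search object.
importances = clf.best_estimator_.feature_importances_
print(importances)
```

This relies on the default `refit=True`; with refitting disabled there would be no `best_estimator_` to read from.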

Re: [Scikit-learn-general] sklearn on CentOS

2014-09-25 Thread Andy
On 09/25/2014 05:02 PM, Pagliari, Roberto wrote: > Hi, > Via yum I got 1.4. for both libraries. Scipy is at 0.14.0 currently. > > > Thanks, > > > -Original Message- > From: Andy [mailto:t3k...@gmail.com] > Sent: Thursday, September 25, 2014 10:59 AM > To: scikit-learn-general@lists.source

[Scikit-learn-general] feature_importances_ from gridsearchCV

2014-09-25 Thread Pagliari, Roberto
When using GridSearchCV with random forests, is there a way to get the feature_importances_? For example, with a code like this parameters = { } clf = grid_search.GridSearchCV(RandomForestClassifier(), param_grid=parameters) clf.fit(X, y) the object clf will not have feature_impo

Re: [Scikit-learn-general] sklearn on CentOS

2014-09-25 Thread Pagliari, Roberto
Hi, Via yum I got 1.4. for both libraries. Thanks, -Original Message- From: Andy [mailto:t3k...@gmail.com] Sent: Thursday, September 25, 2014 10:59 AM To: scikit-learn-general@lists.sourceforge.net Subject: Re: [Scikit-learn-general] sklearn on CentOS On 09/25/2014 03:17 PM, Paglia

Re: [Scikit-learn-general] sklearn on CentOS

2014-09-25 Thread Andy
On 09/25/2014 03:17 PM, Pagliari, Roberto wrote: > Hi All, > I used yum to install numpy and scipy. I stayed with CentOS because of external > constraints. I will try to use, at least, a more recent version. That is odd. I think you should try to go with anaconda (or canopy). If you want to pursue the

Re: [Scikit-learn-general] sklearn on CentOS

2014-09-25 Thread Pagliari, Roberto
I will give it a try. Thank you! -Original Message- From: Aaron O'Leary [mailto:aaron.ole...@gmail.com] Sent: Thursday, September 25, 2014 9:42 AM To: scikit-learn-general@lists.sourceforge.net Subject: Re: [Scikit-learn-general] sklearn on CentOS Hi Roberto, I develop on CentOS 5.5 an

Re: [Scikit-learn-general] sklearn on CentOS

2014-09-25 Thread Aaron O'Leary
Hi Roberto, I develop on CentOS 5.5 and use Anaconda with no issues. Enthought Canopy also works fine. Give anaconda a try, as it is easy to get set up: http://docs.continuum.io/anaconda/install.html Otherwise, I'd avoid using yum and just use pip to install the python packages that you want, i

Re: [Scikit-learn-general] sklearn on CentOS

2014-09-25 Thread Pagliari, Roberto
Hi All, I used yum to install numpy and scipy. I stayed with CentOS because of external constraints. I will try to use, at least, a more recent version. Thank you, -Original Message- From: Kyle Kastner [mailto:kastnerk...@gmail.com] Sent: Thursday, September 25, 2014 9:14 AM To: scikit-lear

Re: [Scikit-learn-general] sklearn on CentOS

2014-09-25 Thread Kyle Kastner
To be honest - updating python packages on CentOS is a nightmare. The whole OS is pretty strongly dependent on python version, which I believe is up to 2.6 now (2.4 in 5.x!). In my experience CentOS is the worst Linux OS for development (heavily locked down, hard to add packages, yum is annoying, e

Re: [Scikit-learn-general] cross validation with random forests

2014-09-25 Thread Andy
On 09/23/2014 11:50 PM, Pagliari, Roberto wrote: I’m a bit confused as to why gridsearchCV is not needed with random forests. I understand that with RF, each tree will only get to see a partial representation of the data. Why do you say GridSearchCV is not needed? I think it should always b

Re: [Scikit-learn-general] sklearn on CentOS

2014-09-25 Thread Andy
Hi Roberto. How are you trying to install scikit-learn, and how did you install scipy and numpy? There is a mismatch in the numpy and scipy you installed. I couldn't find a list of CentOS packages online. CentOS 6.5 seems pretty out of date (Python 2.6), and the version of numpy you