Re: [Scikit-learn-general] Spark-backed implementations of scikit-learn estimators

2013-12-04 Thread Mathieu Blondel
It seems to me that a utility script to convert an svmlight file to an RDD could be useful. Mathieu On Thu, Dec 5, 2013 at 4:29 AM, Peter Prettenhofer < peter.prettenho...@gmail.com> wrote: > Great news - looking forward to the outcome of the sprint! > > > 2013/12/4 Olivier Grisel > >> I mean
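A minimal sketch of what such a conversion could build on, assuming the usual svmlight/libsvm text layout (`label index:value ...` with an optional `#` comment). The function name `parse_svmlight_line` is hypothetical and not part of scikit-learn or Spark; it only parses one line, which is the piece that would be mapped over an RDD of text lines.

```python
def parse_svmlight_line(line):
    """Parse one svmlight/libsvm line into (label, {feature_index: value}).

    Returns None for blank or comment-only lines.
    """
    # Drop a trailing comment, if any, then strip surrounding whitespace.
    body = line.split("#", 1)[0].strip()
    if not body:
        return None
    tokens = body.split()
    label = float(tokens[0])
    features = {}
    for token in tokens[1:]:
        index, value = token.split(":", 1)
        if index == "qid":
            # Ranking query ids are ignored in this sketch.
            continue
        features[int(index)] = float(value)
    return label, features


# Small in-memory example (hypothetical data, no Spark needed).
sample_lines = ["1 1:0.5 3:2.0 # first row", "-1 2:1.5", ""]
parsed = [p for p in map(parse_svmlight_line, sample_lines) if p is not None]
```

With PySpark, and assuming an existing `SparkContext` named `sc`, the same function could be applied as `sc.textFile(path).map(parse_svmlight_line).filter(lambda r: r is not None)`.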

[Scikit-learn-general] Decision Tree Classification of Raster Data using scikit-learn

2013-12-04 Thread Jibran Khan
Hello, I am using Scikit-learn module of Python for classification analysis of my data. I am using MODIS satellite sensor image (raster data in .hdf format having 1-7 spectral bands) and I actually need to perform decision tree classification of this dataset. I have converted the raster data into
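A rough sketch of the general approach, assuming the raster has already been loaded into a NumPy array of shape (bands, rows, cols). The data and labels below are synthetic stand-ins, not MODIS values, and reading the .hdf file is not shown. Each pixel becomes one sample with one feature per band.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a 7-band raster: shape (bands, rows, cols).
rng = np.random.RandomState(0)
n_bands, n_rows, n_cols = 7, 20, 30
raster = rng.rand(n_bands, n_rows, n_cols)

# One sample per pixel, one feature per band: shape (rows * cols, bands).
X = raster.reshape(n_bands, -1).T

# Hypothetical labels, derived from band 0 only so the example is learnable.
# In practice these would come from labelled training pixels.
y = (X[:, 0] > 0.5).astype(int)

clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X, y)

# Predict a class for every pixel and reshape back to the raster grid.
class_map = clf.predict(X).reshape(n_rows, n_cols)
```

The resulting `class_map` has the same rows and columns as the input raster, so it can be written back out as a single-band classified image.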

Re: [Scikit-learn-general] Spark-backed implementations of scikit-learn estimators

2013-12-04 Thread Peter Prettenhofer
Great news - looking forward to the outcome of the sprint! 2013/12/4 Olivier Grisel > I meant San Francisco... > > -- > Olivier

Re: [Scikit-learn-general] Spark-backed implementations of scikit-learn estimators

2013-12-04 Thread Olivier Grisel
I meant San Francisco... -- Olivier

Re: [Scikit-learn-general] Spark-backed implementations of scikit-learn estimators

2013-12-04 Thread Olivier Grisel
Cloudera offered to host the sprint in San Francisco. I created a page on the wiki. Please feel free to register if you would like to participate: https://github.com/scikit-learn/scikit-learn/wiki/Upcoming-events

Re: [Scikit-learn-general] classification on object array

2013-12-04 Thread Olivier Grisel
2013/12/4 abhishek : > hi Olivier, > > Thanks for the reply. > > In my case each row of X contains two normal distributions (one 1-D and > second 2-D). > So a row of X looks like this : [ [mean1(1x1)] [variance1(1x1)] [mean2 > (1x2)] [variance2(2x2)] ] > In case of normal distributions, do you thin

Re: [Scikit-learn-general] from sklearn.all import *

2013-12-04 Thread josef . pktd
On Wed, Dec 4, 2013 at 9:58 AM, Olivier Grisel wrote: > As a user I must confess that I like the flat numpy API, both in > interactive sessions and in regular code. The main con is that it's > often hard to find the source code of a particular class or function, > especially when it's a builtin ob

Re: [Scikit-learn-general] classification on object array

2013-12-04 Thread abhishek
Hi Olivier, Thanks for the reply. In my case each row of X contains two normal distributions (one 1-D and the second 2-D). So a row of X looks like this: [ [mean1(1x1)] [variance1(1x1)] [mean2 (1x2)] [variance2(2x2)] ] In case of normal distributions, do you think features will be preserved if I fla
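For illustration, the row layout described above can be turned into a regular 2-D float array by concatenating the raveled parts. This is only a sketch with made-up values, and `flatten_row` is a hypothetical helper; it shows the mechanics of flattening, not whether the flattened representation is a good one for a given classifier.

```python
import numpy as np


def flatten_row(mean1, var1, mean2, cov2):
    """Concatenate the parameters of both distributions into one 1-D vector."""
    parts = [mean1, var1, mean2, cov2]
    return np.concatenate([np.asarray(p, dtype=float).ravel() for p in parts])


# Two hypothetical rows: (mean1 1x1, variance1 1x1, mean2 1x2, variance2 2x2).
rows = [
    ([1.0], [0.5], [2.0, 5.0], [[1.0, 2.0], [2.0, 4.0]]),
    ([2.0], [0.1], [54.0, 52.0], [[11.0, 3.0], [3.0, 4.0]]),
]

# Each row becomes 1 + 1 + 2 + 4 = 8 features.
X = np.vstack([flatten_row(*row) for row in rows])
```

Because a covariance matrix is symmetric, the flattened 2x2 block repeats its off-diagonal entry; keeping only the upper triangle would remove that duplicate column.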

Re: [Scikit-learn-general] classification on object array

2013-12-04 Thread Olivier Grisel
2013/12/4 abhishek : > Hello everyone, > > Im familiar with the scikit-learn classifiers and have used them a lot of > times in some research. The problem I'm facing right now is my data is in > the form of numpy object array. > > For example X is: > > X = [ [1,2] [2,5] [[1,2],[3,4]], > [

Re: [Scikit-learn-general] Contributing with Kernel partial least squares

2013-12-04 Thread Olivier Grisel
Thanks! Could you also run a quick grid search to fine-tune gamma and C for all models independently? Another question: what is the time complexity of KPLS w.r.t. n_samples? Quadratic, cubic? Could you please extend your gist to randomly subsample the digits dataset to only keep 50% and re-run
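A sketch of the kind of experiment requested here, using an RBF `SVC` as a stand-in because KPLS is not part of scikit-learn. The parameter grid, the 3-fold setting and the random seed are assumptions chosen only to keep the example quick; each model under comparison would get its own search of this form.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

digits = load_digits()
X, y = digits.data, digits.target

# Randomly keep 50% of the samples.
rng = np.random.RandomState(0)
keep = rng.permutation(X.shape[0])[: X.shape[0] // 2]
X_half, y_half = X[keep], y[keep]

# Hypothetical grid; the ranges would need adjusting per model.
param_grid = {"C": [1, 10, 100], "gamma": [1e-4, 1e-3, 1e-2]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=3)
search.fit(X_half, y_half)

best_params = search.best_params_
best_score = search.best_score_
```

Re-running the same search on subsamples of different sizes and timing `fit` is one way to get an empirical feel for how the cost grows with n_samples.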

[Scikit-learn-general] classification on object array

2013-12-04 Thread abhishek
Hello everyone, I'm familiar with the scikit-learn classifiers and have used them many times in research. The problem I'm facing right now is that my data is in the form of a numpy object array. For example X is: X = [ [1,2] [2,5] [[1,2],[3,4]], [2,4] [54,52] [[11,22],[13,4]],

Re: [Scikit-learn-general] from sklearn.all import *

2013-12-04 Thread Olivier Grisel
As a user I must confess that I like the flat numpy API, both in interactive sessions and in regular code. The main con is that it's often hard to find the source code of a particular class or function, especially when it's a builtin object from a CPython extension. Fortunately in our case, most of
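To illustrate the point about locating source code, here is a small sketch using the standard-library `inspect` module. The helper `source_location` is hypothetical; it returns the defining file for objects written in Python and `None` for builtins, where no Python source file is available.

```python
import inspect
import json


def source_location(obj):
    """Return the path of the file defining obj, or None if unavailable."""
    try:
        return inspect.getsourcefile(obj)
    except TypeError:
        # Raised for builtin objects, e.g. functions implemented in C.
        return None


# A function written in Python: the path to its module file is returned.
python_path = source_location(json.dumps)

# A builtin function: there is no Python source file to point at.
builtin_path = source_location(len)
```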

Re: [Scikit-learn-general] Contributing with Kernel partial least squares

2013-12-04 Thread abdalrahman eweiwi
Hi, Here is a public gist of KPLS: https://gist.github.com/abdhk383/7788156 Regards, Eweiwi On Tue, Dec 3, 2013 at 12:51 PM, abdalrahman eweiwi < abdalrahman.ewe...@gmail.com> wrote: > > Hi, > > Here are preliminary results on classification performance of KPLS using > a 20-fold cross validat

Re: [Scikit-learn-general] Handle sparse data on Instance Reduction

2013-12-04 Thread Olivier Grisel
Hi, indeed the generic exception catching / reraising in the common tests is not very helpful. You can add a test in your own test suite to check where it breaks in your code:
    import scipy.sparse as sp
    X_train_csr = sp.csr_matrix(X_train)
    X_test_csr = sp.csr_matrix(X_test)
    model = MyModel().
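The snippet above is cut off after `MyModel().`, so the following is only a guess at the shape such a test might take. `MyModel` here is a hypothetical nearest-centroid stand-in, and the `fit`/`predict` calls and the dense-versus-sparse comparison are assumptions, not the original code.

```python
import numpy as np
import scipy.sparse as sp


class MyModel:
    """Hypothetical stand-in estimator: predicts the nearest class centroid."""

    def fit(self, X, y):
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # Row selection by integer index works for both ndarray and CSR input;
        # np.asarray(...).ravel() normalizes the result of the sparse mean.
        self.centroids_ = np.vstack([
            np.asarray(X[np.flatnonzero(y == c)].mean(axis=0)).ravel()
            for c in self.classes_
        ])
        return self

    def predict(self, X):
        if sp.issparse(X):
            X = X.toarray()
        X = np.asarray(X)
        # Squared Euclidean distance from every sample to every centroid.
        dists = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return self.classes_[dists.argmin(axis=1)]


# Synthetic, well-separated two-class data (made up for this sketch).
rng = np.random.RandomState(0)
y_train = np.arange(20) % 2
X_train = rng.rand(20, 5) + 5 * y_train[:, None]
y_test = np.arange(10) % 2
X_test = rng.rand(10, 5) + 5 * y_test[:, None]

X_train_csr = sp.csr_matrix(X_train)
X_test_csr = sp.csr_matrix(X_test)

dense_pred = MyModel().fit(X_train, y_train).predict(X_test)
sparse_pred = MyModel().fit(X_train_csr, y_train).predict(X_test_csr)

# The estimator should behave the same on dense and sparse input.
assert np.array_equal(dense_pred, sparse_pred)
```

Running the fit and predict steps directly like this, outside the generic common-test wrapper, surfaces the full traceback at the line that fails on sparse input.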