Re: [Scikit-learn-general] Fuzzy (K or C) Means algorithm in sklearn?

2013-09-26 Thread Mathieu Blondel
Hi Kyle, No, it's not in scikit-learn yet. We do have GMMs, though, which are another way to do clustering with soft memberships. Mathieu On Thu, Sep 26, 2013 at 5:14 AM, Kyle Kastner kastnerk...@gmail.com wrote: I looked around, but was unable to find a fuzzy clustering algorithm in
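To illustrate what "clustering with soft memberships" means for a Gaussian mixture, here is a minimal sketch. It assumes a current scikit-learn release, where the mixture class is `GaussianMixture` (the 2013 thread predates that name). The toy data and all variable names are made up for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy data (an assumption for this sketch): two well-separated 2-D blobs.
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(loc=-3.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=3.0, scale=0.5, size=(50, 2)),
])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

# Soft memberships: one row per sample, one column per component.
# Each row is a probability distribution over the components.
memberships = gmm.predict_proba(X)

# A hard assignment can still be recovered by taking the most likely component.
hard_labels = memberships.argmax(axis=1)
```

Unlike fuzzy c-means, these memberships are posterior probabilities under a probabilistic model, so the two methods are related but not interchangeable.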

Re: [Scikit-learn-general] NIPS's Machine Learning Open Source Software workshop

2013-09-26 Thread Olivier Grisel
2013/9/26 Gael Varoquaux gael.varoqu...@normalesup.org: On Mon, Sep 16, 2013 at 09:19:23AM +0200, Gilles Louppe wrote: So basically, do we agree that the goal of our proposal to this workshop will only be to further promote the project in the scientific community? Sounds good to me :)

Re: [Scikit-learn-general] Right place for a time-series focused algorithm?

2013-09-26 Thread Olivier Grisel
2013/9/25 Peter Prettenhofer peter.prettenho...@gmail.com: Hi Kyle, personally, I'd love to see SAX in sklearn or any other Python library that I could easily use with sklearn. We don't have any time-series-specific functionality yet (e.g. a lagged-features transformer). So if we choose to add
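For readers unfamiliar with the term, a "lagged features" transform turns a 1-D series into a supervised-learning matrix in which each row holds the previous values of the series. The sketch below is one possible NumPy version; `make_lagged` is a hypothetical helper written for this illustration, not a scikit-learn function.

```python
import numpy as np

def make_lagged(series, n_lags):
    """Hypothetical helper: build (X, y) where row i of X holds the
    n_lags values preceding y[i], most recent lag first."""
    series = np.asarray(series, dtype=float)
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be in [1, len(series) - 1]")
    # Column k-1 holds the series shifted back by k steps.
    X = np.column_stack([
        series[n_lags - k: len(series) - k] for k in range(1, n_lags + 1)
    ])
    y = series[n_lags:]
    return X, y

X, y = make_lagged([1, 2, 3, 4, 5], n_lags=2)
# X rows are [x[t-1], x[t-2]] and y holds x[t].
```

The resulting `X` and `y` can be passed to any regressor's `fit` method. Wrapping the same logic in a transformer class would be what makes it usable inside a Pipeline.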

[Scikit-learn-general] fit GMM on 2D data

2013-09-26 Thread Anna Bos
Hi. I am trying to reproduce the best-fitting plot from http://www.astroml.org/book_figures/chapter4/fig_GMM_1D.html on my data. I have sample data instead of histogram data. I have the following code, but get an error because the dimension of my data is 2. Could anybody help me with this? My data

[Scikit-learn-general] error using CountVectorizer with python 2.6.1

2013-09-26 Thread Akhil Shah
Hi, I've noticed that I get an error using CountVectorizer (or any class that uses fit_transform) in python 2.6.1 with an unspecified vocabulary, as the underlying defaultdict method cannot take 'None'. Is there a fix or do I just need a more recent install of python? Thanks, Akhil

[Scikit-learn-general] Nose Test error

2013-09-26 Thread Jonathan Suit
When I run: nosetests sklearn --exe I get: /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/lib/utils.py:1132: DeprecationWarning: The compiler package is deprecated and removed in Python 3.x. import compiler

Re: [Scikit-learn-general] dev website rebuild

2013-09-26 Thread Fabian Pedregosa
The docs are now written to $previous_path + /stable/, and this broke my upload script. I've updated my script and it should be fixed now. Fabian On Tue, Sep 3, 2013 at 4:41 PM, Fabian Pedregosa fabian.pedreg...@inria.fr wrote: I think it still runs on my machine. I'm looking into it ... On

[Scikit-learn-general] adding correlation functions for Gaussian Process

2013-09-26 Thread Pieter Savenberg
Hello, I wondered if it's possible to add my own correlation functions to the scikit-learn GP class? I'm working on a thesis revolving around the stock market and non-stationary time series, so I'd like to use correlation functions other than the basic ones provided in scikit-learn. Can I maybe add

[Scikit-learn-general] SCIKIT FOR AUDIO PROCESSING

2013-09-26 Thread Sundeep Sivan
Hi, I am an M.Tech student doing my main project in speaker recognition, i.e. recognizing people from their voice. I plan to use Python as my software. My project has a training phase. HMM, ANN, GMM, SVM and DTW can be used for feature matching in speaker recognition. My question is: can I

Re: [Scikit-learn-general] TF-IDF and LSI

2013-09-26 Thread Olivier Grisel
2013/9/7 Tasos Ventouris tasosventou...@hotmail.com: Hello, I have two questions on which I would like your feedback. The first one: Here is my code: from sklearn.feature_extraction.text import TfidfVectorizer documents = [doc1,doc2,doc3] tfidf = TfidfVectorizer().fit_transform(documents)
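The code quoted in this message is cut off and uses undefined names (`doc1`, `doc2`, `doc3`). A self-contained version might look like the sketch below. The placeholder documents are invented. The `TruncatedSVD` step is an assumption based only on the "LSI" in the subject line, not on anything shown in the truncated message.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Placeholder documents (invented for this sketch).
documents = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stock markets fell sharply today",
]

# TF-IDF: a sparse matrix with one row per document, one column per term.
tfidf = TfidfVectorizer().fit_transform(documents)

# LSI/LSA (assumed follow-up step): project the TF-IDF rows onto a small
# number of latent dimensions with a truncated SVD.
lsi = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)
```

`TruncatedSVD` accepts the sparse TF-IDF matrix directly, so there is no need to convert it to a dense array first.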

Re: [Scikit-learn-general] TF-IDF and LSI

2013-09-26 Thread Lars Buitinck
2013/9/26 Olivier Grisel olivier.gri...@ensta.org: 2013/9/7 Tasos Ventouris tasosventou...@hotmail.com: I tried to run my script and then create a string from the list for each text and include those texts into the TfidfVectorizer. I am satisfied with the results, but unfortunately, if I have

Re: [Scikit-learn-general] fit GMM on 2D data

2013-09-26 Thread Yogesh Karpate
I think you have trained the model on 2D data and are evaluating the best model on 1D data. The call logprob, responsibilities = M_best.eval(x) should be passed X, not x, because X is 2D and x is 1D. On Tue, Sep 17, 2013 at 11:21 AM, Anna Bos anna.n@gmail.com wrote: Hi. I try to reproduce

Re: [Scikit-learn-general] SCIKIT FOR AUDIO PROCESSING

2013-09-26 Thread Olivier Grisel
2013/9/17 Sundeep Sivan sundeepsi...@gmail.com: Hi, I am an M.Tech student doing my main project in speaker recognition, i.e. recognizing people from their voice. I plan to use Python as my software. My project has a training phase. HMM, ANN, GMM, SVM and DTW can be used for feature

Re: [Scikit-learn-general] SCIKIT FOR AUDIO PROCESSING

2013-09-26 Thread Yogesh Karpate
HMMs are widely used in speaker recognition; you could explore them. On Tue, Sep 17, 2013 at 4:03 PM, Sundeep Sivan sundeepsi...@gmail.com wrote: Hi, I am an M.Tech student doing my main project in speaker recognition, i.e. recognizing people from their voice. I plan to use Python as my

Re: [Scikit-learn-general] Right place for a time-series focused algorithm?

2013-09-26 Thread Olivier Grisel
2013/9/26 Kyle Kastner kastnerk...@gmail.com: I had not thought about use inside a Pipeline - though now that you mention it, that seems like the ideal use case for an algorithm like this. Is this the PR you mentioned? https://github.com/scikit-learn/scikit-learn/pull/1454 Yes but because of

Re: [Scikit-learn-general] error using CountVectorizer with python 2.6.1

2013-09-26 Thread Olivier Grisel
2013/9/16 Akhil Shah akhil...@gmail.com: Hi, I've noticed that I get an error using CountVectorizer (or any class that uses fit_transform) in python 2.6.1 with an unspecified vocabulary, as the underlying defaultdict method cannot take 'None'. Is there a fix or do I just need a more recent

Re: [Scikit-learn-general] Right place for a time-series focused algorithm?

2013-09-26 Thread Peter Prettenhofer
2013/9/26 Kyle Kastner kastnerk...@gmail.com I had not thought about use inside a Pipeline - though now that you mention it, that seems like the ideal use case for an algorithm like this. Is this the PR you mentioned? https://github.com/scikit-learn/scikit-learn/pull/1454 As far as lagged

Re: [Scikit-learn-general] adding correlation functions for Gaussian Process

2013-09-26 Thread Olivier Grisel
2013/9/20 Pieter Savenberg pieter.savenb...@ugent.be: Hello, I wondered if it's possible to add my own correlation functions to the scikit-learn GP class? I'm working on a thesis revolving around the stock market and non-stationary time series, so I'd like to use correlation functions other than the

Re: [Scikit-learn-general] Right place for a time-series focused algorithm?

2013-09-26 Thread Olivier Grisel
2013/9/26 Peter Prettenhofer peter.prettenho...@gmail.com: 2013/9/26 Kyle Kastner kastnerk...@gmail.com I had not thought about use inside a Pipeline - though now that you mention it, that seems like the ideal use case for an algorithm like this. Is this the PR you mentioned?

Re: [Scikit-learn-general] fit GMM on 2D data

2013-09-26 Thread Jacob Vanderplas
eval() expects data of the same dimension as the fit. Your fit data is shape (48, 2), which is interpreted as 48 points in 2 dimensions. Your eval data is shape (48,) which scikit-learn cannot interpret as (n_samples, n_features). If you fit the model on two-dimensional data, you must call eval
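The shape rule described here can be sketched as follows. The `GMM.eval` call in the thread belongs to the 2013 API. This sketch assumes a current scikit-learn release, where `GaussianMixture.score_samples` and `predict_proba` return the log-probabilities and responsibilities. The random data stands in for the poster's (48, 2) array.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Stand-in for the poster's data: 48 points in 2 dimensions.
rng = np.random.RandomState(0)
X = rng.normal(size=(48, 2))

model = GaussianMixture(n_components=2, random_state=0).fit(X)

# Evaluating on data with the same number of features as the fit works.
logprob = model.score_samples(X)           # shape (48,)
responsibilities = model.predict_proba(X)  # shape (48, 2)

# A 1-D array of shape (48,) is not (n_samples, n_features) and is rejected.
x = X[:, 0]
try:
    model.score_samples(x)
    raised = False
except ValueError:
    raised = True

# To model a single variable, reshape it to (n_samples, 1) and fit on that.
x_column = x.reshape(-1, 1)
model_1d = GaussianMixture(n_components=2, random_state=0).fit(x_column)
logprob_1d = model_1d.score_samples(x_column)
```

In short, whatever shape the model was fit on determines the number of columns every later call must supply.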

Re: [Scikit-learn-general] error using CountVectorizer with python 2.6.1

2013-09-26 Thread Lars Buitinck
2013/9/26 Olivier Grisel olivier.gri...@ensta.org: I have no problem with 2.6.7: I tried with 2.6.1, 2.6.6 and 2.6.8. Only the first has a problem. As I already said on SO, this is probably a bug in Python 2.6.1, but it should be easy to fix: just construct the defaultdict from some callable,
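A minimal sketch of the suggested workaround, building the `defaultdict` from a callable instead of passing `None`. This is an illustration using only the standard library, not the actual scikit-learn source; `build_vocabulary` is a hypothetical helper.

```python
from collections import defaultdict

def build_vocabulary(tokens):
    """Hypothetical helper: map each distinct token to the next free index."""
    # Construct with a real callable (int) rather than None, then point the
    # factory at the dict's own length so each new key gets the next index.
    vocabulary = defaultdict(int)
    vocabulary.default_factory = vocabulary.__len__
    for token in tokens:
        vocabulary[token]  # a lookup on a missing key inserts len(vocabulary)
    return dict(vocabulary)

vocab = build_vocabulary(["a", "b", "a", "c"])
```

The factory is called before the missing key is inserted, so the first unseen token gets index 0, the next gets 1, and so on.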