[Scikit-learn-general] Classificator for probability features

2012-05-14 Thread Philipp Singer
Hey there! I am currently trying to classify a dataset which has the following format: Class1 0.3 0.5 0.2 Class2 0.9 0.1 0.0 ... So the features are probabilities that always sum to exactly 1. I have tried several linear classifiers, but I am now wondering if there is maybe some better way

Re: [Scikit-learn-general] Classificator for probability features

2012-05-14 Thread amueller
I would try using a chi-squared kernel. You can start by using the approximation provided in sklearn. Cheers, andy -- This message was sent from my Android phone with K-9 Mail. Philipp Singer kill...@gmail.com wrote: Hey there! I am currently trying to classify a dataset

Re: [Scikit-learn-general] Classificator for probability features

2012-05-14 Thread Peter Prettenhofer
Hi Philipp, you could try a nearest neighbors approach and use KL-divergence as your distance metric** best, Peter ** KL-divergence is not a proper metric but it might work 2012/5/14 amuel...@ais.uni-bonn.de: I would try using a chi squared Kernel. You can Start by using the approximation
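
The nearest-neighbors suggestion above can be sketched in plain Python. This is only an illustration under stated assumptions, not code from the thread: the helper names (kl_divergence, symmetric_kl, knn_predict), the eps smoothing, and the choice to symmetrize the divergence are all assumptions made here, and the training rows are toy data loosely based on the format in the original question.

```python
import math
from collections import Counter


def kl_divergence(p, q, eps=1e-10):
    """KL(p || q) for two discrete distributions given as equal-length lists.

    eps is a hypothetical smoothing constant that keeps the logarithm finite
    when q contains exact zeros.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p, q):
    """Sum of both directions, since plain KL is not symmetric."""
    return kl_divergence(p, q) + kl_divergence(q, p)


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training rows closest to x."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: symmetric_kl(x, pair[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy rows: each feature vector is a probability distribution summing to 1.
train_X = [[0.3, 0.5, 0.2], [0.25, 0.55, 0.2],
           [0.9, 0.1, 0.0], [0.85, 0.1, 0.05]]
train_y = ["Class1", "Class1", "Class2", "Class2"]

prediction = knn_predict(train_X, train_y, [0.88, 0.1, 0.02], k=3)
```

Because the comparison is brute force, every query is compared against every training row, which is fine for small datasets but scales poorly.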

Re: [Scikit-learn-general] Get TF-IDF mapped with associated word vector

2012-05-14 Thread JAGANADH G
On Fri, May 11, 2012 at 3:06 PM, Olivier Grisel olivier.gri...@ensta.org wrote: 2012/5/10 JAGANADH G jagana...@gmail.com: Hi all, is there any way to get the TF-IDF value mapped with the word vector in sklearn? I would like to get output like w1 - TF-IDF w2 - TF-IDF TF is
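
One possible way to get the word-to-TF-IDF mapping asked about above is sketched below. It assumes a current scikit-learn with TfidfVectorizer and its fitted vocabulary_ attribute; the two documents and the variable names are made up for illustration, and this is not necessarily the approach the reply went on to describe.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus.
docs = ["the cat sat on the mat", "the dog chased the cat"]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(docs)  # sparse matrix, one row per document

# vocabulary_ maps word -> column index; invert it to look words up by column.
index_to_word = {idx: word for word, idx in vectorizer.vocabulary_.items()}

# Build one {word: tf-idf} dictionary per document from the nonzero entries.
weights = [dict() for _ in docs]
coo = tfidf.tocoo()
for i, j, value in zip(coo.row, coo.col, coo.data):
    weights[i][index_to_word[j]] = float(value)

for word, value in sorted(weights[0].items()):
    print(word, "-", round(value, 3))
```

Going through the COO form only visits the nonzero entries, so the full matrix never has to be made dense.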

[Scikit-learn-general] Implementation Question: safe_asarray()

2012-05-14 Thread Daniel Duckworth
Hi everyone, For those familiar with this function in sklearn/utils/validation.py, I was wondering why sparse matrices are passed through silently without respecting the `dtype` or `order` arguments. I can understand why one would want to ignore `order` due to how sparse matrices are designed,

[Scikit-learn-general] linear discriminant analysis on text data

2012-05-14 Thread JAGANADH G
Hi all, is it possible to apply linear discriminant analysis to text data? If so, how can I prepare the data for it? If it is a dumb question, forgive me. Thanks in advance -- ** JAGANADH G http://jaganadhg.in *ILUGCBE* http://ilugcbe.org.in

Re: [Scikit-learn-general] Classificator for probability features

2012-05-14 Thread David Warde-Farley
On Mon, May 14, 2012 at 05:00:54PM +0200, Philipp Singer wrote: Thanks, that sounds really promising. Is there an implementation of KL divergence in scikit-learn? If so, how can I directly use that? I don't believe there is, but it's quite simple to do yourself. Many algorithms in

Re: [Scikit-learn-general] Classificator for probability features

2012-05-14 Thread Philipp Singer
Thanks a lot for the explanation. So do I see this right, that I would need to calculate for each pair of feature vectors the KL divergence? I have already tried to use a pipeline calculating an additive chi squared followed by a linear SVC. This boosts my results a bit. But I am still
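
The pipeline described above (an additive chi-squared feature map followed by a linear SVC) might look roughly like the following. This is a sketch under assumptions, not the poster's actual code: it uses AdditiveChi2Sampler from sklearn.kernel_approximation, and the data is synthetic, drawn from two Dirichlet distributions so that every row is a probability vector.

```python
import numpy as np
from sklearn.kernel_approximation import AdditiveChi2Sampler
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Synthetic stand-in data: rows are non-negative and sum to 1.
rng = np.random.RandomState(0)
X = np.vstack([rng.dirichlet([2, 5, 2], size=50),
               rng.dirichlet([9, 1, 1], size=50)])
y = np.array([0] * 50 + [1] * 50)

# The sampler maps each feature to an approximate chi-squared kernel
# feature space; the linear SVC is then trained in that space.
model = make_pipeline(AdditiveChi2Sampler(sample_steps=2), LinearSVC())
model.fit(X, y)

accuracy = model.score(X, y)  # training accuracy, for a quick sanity check only
```

The sampler requires non-negative input, which probability features satisfy by construction.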

[Scikit-learn-general] Interest in State Space Models

2012-05-14 Thread Daniel Duckworth
Hello everyone, I noticed that scikit-learn (and Python in general) seems to be missing a decent module for State Space Models. State Space Models are a type of generative model wherein one attempts to estimate the hidden state of a system given a sequence of noisy observations. Observations
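
As a concrete instance of estimating a hidden state from noisy observations, here is a minimal one-dimensional Kalman filter in plain Python. It is an illustrative sketch only, not the module proposed in the thread: the random-walk state model, the function name kalman_filter_1d, and all parameter names are assumptions made here.

```python
def kalman_filter_1d(observations, process_var, obs_var,
                     init_mean=0.0, init_var=1.0):
    """Filter a scalar random-walk state observed with Gaussian noise.

    Assumed model: x_t = x_{t-1} + w_t with w_t ~ N(0, process_var)
                   z_t = x_t + v_t     with v_t ~ N(0, obs_var)
    Returns the filtered state mean after each observation.
    """
    mean, var = init_mean, init_var
    estimates = []
    for z in observations:
        # Predict: a random walk keeps the mean, but uncertainty grows.
        var += process_var
        # Update: blend prediction and observation using the Kalman gain.
        gain = var / (var + obs_var)
        mean += gain * (z - mean)
        var *= (1.0 - gain)
        estimates.append(mean)
    return estimates


# A constant signal of 5.0 observed 50 times; the estimate moves from the
# initial guess of 0.0 toward the observed level.
estimates = kalman_filter_1d([5.0] * 50, process_var=0.01, obs_var=1.0)
```

A larger obs_var makes the filter trust each observation less, so the estimate moves more slowly.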

Re: [Scikit-learn-general] Interest in State Space Models

2012-05-14 Thread Skipper Seabold
On Mon, May 14, 2012 at 4:55 PM, Daniel Duckworth duckwor...@gmail.com wrote: Hello everyone, I noticed that scikit-learn (and Python in general) seems to be missing a decent module for State Space Models.  State Space Models are a type of generative model wherein one attempts to estimate the

[Scikit-learn-general] multilayer perceptron questions

2012-05-14 Thread David Marek
Hi, I have worked on multilayer perceptron and I've got a basic implementation working. You can see it at https://github.com/davidmarek/scikit-learn/tree/gsoc_mlp The most important part is the sgd implementation, which can be found here

Re: [Scikit-learn-general] linear discriminant analysis on text data

2012-05-14 Thread Robert Layton
On 15 May 2012 04:20, JAGANADH G jagana...@gmail.com wrote: Hi all, is it possible to apply linear discriminant analysis to text data? If so, how can I prepare the data for it? If it is a dumb question, forgive me. Thanks in advance -- ** JAGANADH G

[Scikit-learn-general] for multilabel classification is it necessary to train all combinations of the labels in the training set? Is there any way to do without training for all combinations?

2012-05-14 Thread Bilal Allawala
Hi, I am trying to classify text by places. A piece of text can belong to one or more places. My code (attached below) returns: nice day in nyc = new york welcome to london = london hello welcome to new york. It has theaters like london = new york, london but if I take out the london and new
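
One common way to avoid training on every label combination is a one-vs-rest setup, which fits one independent binary classifier per label. The sketch below is not the poster's attached code (which is not shown here). It assumes a current scikit-learn with OneVsRestClassifier and MultiLabelBinarizer, and the training sentences are invented. Note that no training example carries both labels.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

# Hypothetical training set: every text has exactly one place label.
train_texts = [
    "new york is a big city",
    "nyc has tall buildings",
    "the big apple is new york",
    "london is rainy",
    "british parliament sits in london",
    "the london eye is famous",
]
train_labels = [["new york"]] * 3 + [["london"]] * 3

# Turn the label lists into a binary indicator matrix, one column per place.
binarizer = MultiLabelBinarizer()
Y = binarizer.fit_transform(train_labels)

# One binary LinearSVC per column; each decides its own label independently.
model = make_pipeline(TfidfVectorizer(), OneVsRestClassifier(LinearSVC()))
model.fit(train_texts, Y)

predicted = model.predict(["nice day in nyc", "welcome to london"])
labels = binarizer.inverse_transform(predicted)  # tuples of place names
```

Since each label has its own decision function, a text can in principle receive several labels, or none, even if that combination never appeared in training. Whether it does depends on the features and the data.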