Re: [Scikit-learn-general] Classificator for probability features

2012-05-14 Thread amueller
I would try using a chi-squared kernel. You can start by using the 
approximation provided in sklearn.
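
Something along these lines should get you started (untested sketch with a
recent scikit-learn; the Dirichlet toy data just stands in for your
probability features):

    import numpy as np
    from sklearn.kernel_approximation import AdditiveChi2Sampler
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    # Toy stand-in: rows are probability vectors that sum to 1.
    rng = np.random.RandomState(0)
    X = rng.dirichlet([1.0, 1.0, 1.0], size=200)
    y = (X[:, 0] > X[:, 1]).astype(int)  # arbitrary labels for the demo

    # Explicit feature map approximating the additive chi2 kernel,
    # followed by a linear SVM on the mapped features.
    clf = make_pipeline(AdditiveChi2Sampler(sample_steps=2), LinearSVC())
    clf.fit(X, y)
    print(clf.score(X, y))
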
Cheers, andy
-- 
This message was sent from my Android mobile phone with K-9 Mail.



Philipp Singer kill...@gmail.com wrote:

Hey there!

I am currently trying to classify a dataset which has the following format:

Class1 0.3 0.5 0.2
Class2 0.9 0.1 0.0
...

So the features are probabilities that always sum to exactly 1.

I have tried several linear classifiers, but I am now wondering whether there 
is a better way to classify such data and achieve better results.

Maybe someone has some ideas.

Thanks and regards,
Philipp



Re: [Scikit-learn-general] Classificator for probability features

2012-05-14 Thread Peter Prettenhofer
Hi Philipp,

you could try a nearest neighbors approach and use KL-divergence as
your distance metric**
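
A rough, untested sketch (with a small epsilon to sidestep zeros) could
look like this:

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    def sym_kl(p, q, eps=1e-10):
        # Symmetrized KL divergence; eps avoids log(0) and division by zero.
        p = np.clip(p, eps, None); p = p / p.sum()
        q = np.clip(q, eps, None); q = q / q.sum()
        return np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p))

    # Toy stand-in data: rows are probability vectors.
    rng = np.random.RandomState(0)
    X = rng.dirichlet([1.0, 1.0, 1.0], size=200)
    y = (X[:, 0] > X[:, 1]).astype(int)

    # Brute-force neighbor search with a user-defined "metric".
    knn = KNeighborsClassifier(n_neighbors=5, metric=sym_kl, algorithm='brute')
    knn.fit(X, y)
    print(knn.score(X, y))
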

best,
 Peter

** KL-divergence is not a proper metric but it might work


-- 
Peter Prettenhofer



Re: [Scikit-learn-general] Classificator for probability features

2012-05-14 Thread David Warde-Farley
On Mon, May 14, 2012 at 05:00:54PM +0200, Philipp Singer wrote:
 Thanks, that sounds really promising.
 
 Is there an implementation of KL divergence in scikit-learn? If so, how can I 
 directly use that?

I don't believe there is, but it's quite simple to do yourself. Many
algorithms in scikit-learn can take a precomputed distance matrix.

Given two points, p and q, on the simplex, the KL divergence between the two
discrete distributions they represent is simply (p * np.log(p / q)).sum(). Note
that this is in general not defined if they do not share the same support
(i.e. if there is a zero at one spot in one but not in the other). In
practice, if there are any zeros at all, you will need to deal with them
explicitly, as the logarithm and/or the division will misbehave.

Note that the grandparent's point that the KL divergence is not a metric is
not a minor concern: the KL divergence, for example, is _not_ symmetric
(KL(p, q) != KL(q, p)).  You can of course take the average of KL(p, q) and
KL(q, p) to symmetrize it, but you may still run into problems with
algorithms that assume that distances obey the triangle inequality (KL
divergences do not).
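
Concretely, a smoothed, symmetrized KL distance matrix could be precomputed
along these lines (untested sketch; the epsilon and the toy data are just
placeholders):

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    def kl(p, q):
        return np.sum(p * np.log(p / q))

    # Toy probability vectors; clip + renormalize to remove exact zeros.
    rng = np.random.RandomState(0)
    X = rng.dirichlet([1.0, 1.0, 1.0], size=100)
    y = (X[:, 0] > X[:, 1]).astype(int)
    Xs = np.clip(X, 1e-10, None)
    Xs = Xs / Xs.sum(axis=1, keepdims=True)

    # Symmetrized pairwise "distances": 0.5 * (KL(p, q) + KL(q, p)).
    n = len(Xs)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = 0.5 * (kl(Xs[i], Xs[j]) + kl(Xs[j], Xs[i]))

    # Any estimator accepting metric='precomputed' can consume D directly.
    knn = KNeighborsClassifier(n_neighbors=5, metric='precomputed')
    knn.fit(D, y)
    print(knn.score(D, y))
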

Personally I would recommend trying Andy's suggestion re: an SVM with a
chi-squared kernel. For small instances you can precompute the kernel
matrix and pass it to SVC yourself. If you have a lot of data (or if you want
to try it out quickly) the kernel approximations module plus a linear SVM
is a good bet.
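
For the precomputed route, the sketch below (untested, recent scikit-learn)
builds the chi-squared kernel matrix explicitly and hands it to SVC:

    import numpy as np
    from sklearn.metrics.pairwise import chi2_kernel
    from sklearn.svm import SVC

    # Toy stand-in data.
    rng = np.random.RandomState(0)
    X = rng.dirichlet([1.0, 1.0, 1.0], size=150)
    y = (X[:, 0] > X[:, 1]).astype(int)
    X_train, X_test = X[:100], X[100:]
    y_train, y_test = y[:100], y[100:]

    K_train = chi2_kernel(X_train)          # (n_train, n_train)
    K_test = chi2_kernel(X_test, X_train)   # (n_test, n_train)

    svm = SVC(kernel='precomputed', C=1.0)
    svm.fit(K_train, y_train)
    print(svm.score(K_test, y_test))
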

David



Re: [Scikit-learn-general] Classificator for probability features

2012-05-14 Thread Philipp Singer
Thanks a lot for the explanation.

Do I understand this right: would I need to calculate the KL divergence for 
each pair of feature vectors?

I have already tried a pipeline of an additive chi-squared kernel 
approximation followed by a linear SVC. This boosts my results a bit, but I am 
still stuck at an F1 score of 0.25 and want to improve this if 
possible. Is this the right way to do it?

Maybe there are some tweaks I should try, like tuning the parameters etc.
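
By "tuning the parameters" I mean something like this (untested sketch with a
recent scikit-learn; the grid values are just guesses):

    import numpy as np
    from sklearn.kernel_approximation import AdditiveChi2Sampler
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.svm import LinearSVC

    # Toy stand-in for my data: probability vectors + labels.
    rng = np.random.RandomState(0)
    X = rng.dirichlet([1.0, 1.0, 1.0], size=200)
    y = (X[:, 0] > X[:, 1]).astype(int)

    pipe = Pipeline([('chi2', AdditiveChi2Sampler()), ('svc', LinearSVC())])
    param_grid = {
        'chi2__sample_steps': [1, 2, 3],
        'svc__C': [0.01, 0.1, 1.0, 10.0],
    }
    search = GridSearchCV(pipe, param_grid, scoring='f1_macro', cv=5)
    search.fit(X, y)
    print(search.best_params_, search.best_score_)
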

Sorry for the dumb questions, but I haven't used any of these methods 
until now. Still excited to learn more about them ;)

Regards,
Philipp
