Re: [Scikit-learn-general] sklearn.test() weirdness

2012-01-05 Thread xinfan meng
On Fri, Jan 6, 2012 at 7:00 AM, Olivier Grisel wrote: > 2012/1/5 Fabian Pedregosa : > > On Thu, Jan 5, 2012 at 11:30 PM, Gael Varoquaux > > wrote: > >> On Thu, Jan 05, 2012 at 11:28:45PM +0100, Andreas wrote: > >>> As I said, I don't have multiple versions and the only > >>> thing that fails is s

Re: [Scikit-learn-general] sklearn.test() weirdness

2012-01-05 Thread Vlad Niculae
On Jan 5, 2012, at 23:45 , Fabian Pedregosa wrote: > and that was > quite convenient for testing on systems on which nosetest fails > (windows). Hi Fabian Could you please be more specific regarding this point, since as a former Windows user, I find that I don't know what you mean. On topic, I

Re: [Scikit-learn-general] sklearn.test() weirdness

2012-01-05 Thread Olivier Grisel
2012/1/5 Fabian Pedregosa : > On Thu, Jan 5, 2012 at 11:30 PM, Gael Varoquaux > wrote: >> On Thu, Jan 05, 2012 at 11:28:45PM +0100, Andreas wrote: >>> As I said, I don't have multiple versions and the only >>> thing that fails is sklearn.test(). >> >> OK, so let's move the warning there. > > +1. R

Re: [Scikit-learn-general] sklearn.test() weirdness

2012-01-05 Thread Fabian Pedregosa
On Thu, Jan 5, 2012 at 11:30 PM, Gael Varoquaux wrote: > On Thu, Jan 05, 2012 at 11:28:45PM +0100, Andreas wrote: >> As I said, I don't have multiple versions and the only >> thing that fails is sklearn.test(). > > OK, so let's move the warning there. +1. Raising an exception is also OK and might

Re: [Scikit-learn-general] as_float_array with sparse matrices

2012-01-05 Thread Lars Buitinck
2012/1/5 Olivier Grisel : > We could rename it "as_float_data" instead. I have no strong opinion on this. Hmm... well, as it's internal, we might as well leave it as is. -- Lars Buitinck Scientific programmer, ILPS University of Amsterdam

Re: [Scikit-learn-general] sklearn.test() weirdness

2012-01-05 Thread Gael Varoquaux
On Thu, Jan 05, 2012 at 11:28:45PM +0100, Andreas wrote: > As I said, I don't have multiple versions and the only > thing that fails is sklearn.test(). OK, so let's move the warning there. G

Re: [Scikit-learn-general] sklearn.test() weirdness

2012-01-05 Thread Andreas
On 01/05/2012 10:45 PM, Fabian Pedregosa wrote: > On Thu, Jan 5, 2012 at 11:47 AM, Andreas > wrote: >> On 01/05/2012 07:37 AM, Gael Varoquaux wrote: >>> On Wed, Jan 04, 2012 at 03:38:41PM -0800, Jacob VanderPlas wrote: >>> Importing sklearn from within the scikit-learn source directory

Re: [Scikit-learn-general] as_float_array with sparse matrices

2012-01-05 Thread Olivier Grisel
2012/1/5 Lars Buitinck : > 2012/1/5 Gael Varoquaux : >> On Thu, Jan 05, 2012 at 10:40:20PM +0100, Gael Varoquaux wrote: >>> Right now, when a sparse matrix is given at the validation utility >>> 'as_float_array', it crashes with the following incomprehensible error >>> message: >> >> Alright, it's

Re: [Scikit-learn-general] as_float_array with sparse matrices

2012-01-05 Thread Gael Varoquaux
On Thu, Jan 05, 2012 at 11:17:34PM +0100, Lars Buitinck wrote: > > Alright, it's not what I thought. Forget this message, apparently this > > incomprehensible error message is raised for another reason. > But I'm still surprised at the behavior: > In [1]: from sklearn.utils import as_float_array

Re: [Scikit-learn-general] KMeans implementation in C with OpenMP

2012-01-05 Thread Fabian Pedregosa
On Thu, Jan 5, 2012 at 11:14 PM, Lars Buitinck wrote: > 2012/1/5 Fabian Pedregosa : >> Good luck with the building issues. A safe solution would be to do >> conditional compilation. Scipy does it and we use the same build >> system, unfortunately it's not as easy as it sounds :-). > > I hope you m

Re: [Scikit-learn-general] as_float_array with sparse matrices

2012-01-05 Thread Lars Buitinck
2012/1/5 Gael Varoquaux : > On Thu, Jan 05, 2012 at 10:40:20PM +0100, Gael Varoquaux wrote: >> Right now, when a sparse matrix is given at the validation utility >> 'as_float_array', it crashes with the following incomprehensible error >> message: > > Alright, it's not what I thought. Forget this m

Re: [Scikit-learn-general] KMeans implementation in C with OpenMP

2012-01-05 Thread Lars Buitinck
2012/1/5 Fabian Pedregosa : > Good luck with the building issues. A safe solution would be to do > conditional compilation. Scipy does it and we use the same build > system, unfortunately it's not as easy as it sounds :-). I hope you mean just passing compiler options optionally; you don't need co

Re: [Scikit-learn-general] sklearn.test() weirdness

2012-01-05 Thread Fabian Pedregosa
On Thu, Jan 5, 2012 at 11:47 AM, Andreas wrote: > On 01/05/2012 07:37 AM, Gael Varoquaux wrote: >> On Wed, Jan 04, 2012 at 03:38:41PM -0800, Jacob VanderPlas wrote: >> >>> Importing sklearn from within the scikit-learn source directory produces >>> no such error.  Perhaps this would be a good fix

Re: [Scikit-learn-general] as_float_array with sparse matrices

2012-01-05 Thread Gael Varoquaux
On Thu, Jan 05, 2012 at 10:40:20PM +0100, Gael Varoquaux wrote: > Right now, when a sparse matrix is given at the validation utility > 'as_float_array', it crashes with the following incomprehensible error > message: Alright, it's not what I thought. Forget this message, apparently this incomprehe

[Scikit-learn-general] as_float_array with sparse matrices

2012-01-05 Thread Gael Varoquaux
Hi list, Right now, when a sparse matrix is given at the validation utility 'as_float_array', it crashes with the following incomprehensible error message: /usr/local/lib/python2.7/dist-packages/scikit_learn-0.10_git-py2.7-linux-x86_64. egg/sklearn/utils/validation.pyc in as_float_array(X, copy)

Re: [Scikit-learn-general] KMeans implementation in C with OpenMP

2012-01-05 Thread Fabian Pedregosa
On Thu, Jan 5, 2012 at 9:35 AM, Benjamin Hepp wrote: > My implementation is assuming all data fits in memory. I'll do some > benchmarks and look into the openmp/building/cython issues. Hey, Good luck with the building issues. A safe solution would be to do conditional compilation. Scipy does it

Re: [Scikit-learn-general] Building docs with math

2012-01-05 Thread Jacob VanderPlas
OK I'll push to master Jake Andreas wrote: > It does :) > Thanks a lot! > > > On 01/05/2012 09:19 PM, Jacob VanderPlas wrote: > >> See if this fixes things: >> https://github.com/jakevdp/scikit-learn/tree/doc-math-fix >>Jake >> >> Andreas wrote: >> >> >>> On 01/05/2012 08:58 PM,

Re: [Scikit-learn-general] Building docs with math

2012-01-05 Thread Andreas
It does :) Thanks a lot! On 01/05/2012 09:19 PM, Jacob VanderPlas wrote: > See if this fixes things: > https://github.com/jakevdp/scikit-learn/tree/doc-math-fix >Jake > > Andreas wrote: > >> On 01/05/2012 08:58 PM, Andreas wrote: >> >> >>> Thanks Jake for looking into this. >>> I ha

Re: [Scikit-learn-general] Building docs with math

2012-01-05 Thread Jacob VanderPlas
See if this fixes things: https://github.com/jakevdp/scikit-learn/tree/doc-math-fix Jake Andreas wrote: > On 01/05/2012 08:58 PM, Andreas wrote: > >> Thanks Jake for looking into this. >> I have 8 pngs in this directory >> >> >> > If I do a "make clean" and a "make" again, I get

Re: [Scikit-learn-general] Building docs with math

2012-01-05 Thread Jacob VanderPlas
I get 328, followed by 244. I think a more careful removal of the images will fix this. Jake Andreas wrote: > On 01/05/2012 08:58 PM, Andreas wrote: > >> Thanks Jake for looking into this. >> I have 8 pngs in this directory >> >> >> > If I do a "make clean" and a "make" again,

Re: [Scikit-learn-general] Building docs with math

2012-01-05 Thread Andreas
On 01/05/2012 08:58 PM, Andreas wrote: > Thanks Jake for looking into this. > I have 8 pngs in this directory > > If I do a "make clean" and a "make" again, I get 328 ;) > On 01/05/2012 08:47 PM, Jacob VanderPlas wrote: > >> I'm having trouble replicating the problem. >> When you ``mak

Re: [Scikit-learn-general] Building docs with math

2012-01-05 Thread Andreas
Thanks Jake for looking into this. I have 8 pngs in this directory On 01/05/2012 08:47 PM, Jacob VanderPlas wrote: > I'm having trouble replicating the problem. > When you ``make html`` twice in a row, do you see anything in the > _build/html/_images/math directory? >Jake > > Andreas wrote

Re: [Scikit-learn-general] Building docs with math

2012-01-05 Thread Jacob VanderPlas
I'm having trouble replicating the problem. When you ``make html`` twice in a row, do you see anything in the _build/html/_images/math directory? Jake Andreas wrote: > On 01/05/2012 08:03 PM, Jacob VanderPlas wrote: > >> I wonder if this is a problem with that doc/image fix I put up during >

Re: [Scikit-learn-general] Building docs with math

2012-01-05 Thread Jacob VanderPlas
Definitely related. I guess the code should be modified to not use "rmtree" but to just remove the figure images alone. I'll take a look Jake Andreas wrote: > On 01/05/2012 08:03 PM, Jacob VanderPlas wrote: > >> I wonder if this is a problem with that doc/image fix I put up during >> the s
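The fix Jake describes, removing only the figure images instead of calling "rmtree" on the whole image directory, might look roughly like the sketch below. The function name `remove_figure_images`, the `.png` filter, and the `math` subdirectory used in the demo are assumptions for illustration; this is not the actual scikit-learn doc build code.

```python
import os
import tempfile


def remove_figure_images(image_dir, extensions=(".png",)):
    """Delete image files directly inside image_dir.

    Subdirectories (for example a hypothetical 'math' folder holding
    rendered formulas) are left untouched, unlike shutil.rmtree.
    """
    removed = []
    for name in sorted(os.listdir(image_dir)):
        path = os.path.join(image_dir, name)
        # Only plain files with a matching extension are removed.
        if os.path.isfile(path) and name.lower().endswith(extensions):
            os.remove(path)
            removed.append(name)
    return removed


def demo():
    """Build a throwaway directory tree and clean it selectively."""
    with tempfile.TemporaryDirectory() as image_dir:
        math_dir = os.path.join(image_dir, "math")
        os.mkdir(math_dir)
        for path in (os.path.join(image_dir, "plot_example.png"),
                     os.path.join(math_dir, "formula.png")):
            with open(path, "w") as f:
                f.write("placeholder")
        removed = remove_figure_images(image_dir)
        survivors = os.listdir(math_dir)
    return removed, survivors
```

Because the loop never descends into subdirectories, the math images survive a rebuild and do not have to be regenerated.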

Re: [Scikit-learn-general] Other distance metrics for kNN

2012-01-05 Thread Jacob VanderPlas
Emanuele, I should also note that a distinct advantage of cover trees is that, unlike ball tree, there is no need to compute the mean/median point of each node. This means that their storage can be much more compact, and they'd be very suitable to computing distances within sparse data. For t

Re: [Scikit-learn-general] Building docs with math

2012-01-05 Thread Andreas
On 01/05/2012 08:03 PM, Jacob VanderPlas wrote: > I wonder if this is a problem with that doc/image fix I put up during > the sprint? When the docs are re-made, all the math images are > removed. I recall checking and seeing that they were re-generated, but > I may be wrong. Can you check this,

Re: [Scikit-learn-general] Building docs with math

2012-01-05 Thread Jacob VanderPlas
I wonder if this is a problem with that doc/image fix I put up during the sprint? When the docs are re-made, all the math images are removed. I recall checking and seeing that they were re-generated, but I may be wrong. Can you check this, Andy? Jake Andreas wrote: > On 01/05/2012 07:53 PM

Re: [Scikit-learn-general] Building docs with math

2012-01-05 Thread Andreas
On 01/05/2012 07:53 PM, Gael Varoquaux wrote: > On Thu, Jan 05, 2012 at 07:44:12PM +0100, Andreas wrote: > >> Can anyone explain to me how to build the html docs >> so that the math is rendered with latex? >> > It should be. You need to use the ``.. math::`` directive. > > I compared http:/

Re: [Scikit-learn-general] Building docs with math

2012-01-05 Thread Gael Varoquaux
On Thu, Jan 05, 2012 at 07:44:12PM +0100, Andreas wrote: > Can anyone explain to me how to build the html docs > so that the math is rendered with latex? It should be. You need to use the ``.. math::`` directive. > This is the pngmath_latex sphinx plugin, right? It's actually fully done through matplot

[Scikit-learn-general] Building docs with math

2012-01-05 Thread Andreas
Hi everybody. Can anyone explain to me how to build the html docs so that the math is rendered with latex? This is the pngmath_latex sphinx plugin, right? I have latex and dvipng installed but the images don't show up. Anything else I need or any other build command? There seems to be nothing in th

Re: [Scikit-learn-general] Other distance metrics for kNN

2012-01-05 Thread Olivier Grisel
Also we should ensure that all of those naming conventions for distances are consistent with what we already have in the sklearn.metrics.pairwise module. -- Olivier

Re: [Scikit-learn-general] Other distance metrics for kNN

2012-01-05 Thread Olivier Grisel
I would rename "tmp" as "work_buffer". Same for "VI" I don't understand the meaning either. I would also like to avoid "p" in the public API of kNN methods. Maybe use a "distance" parameter that could accept string values such as distance="euclidean" or distance="manhattan" or a float value that
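One way to read Olivier's proposal is a small resolver that turns the public "distance" value into a Minkowski exponent. The message is cut off after "a float value that", so treating a float as the exponent p is an assumption, as are the function name `resolve_minkowski_p` and the rejection of values below 1.

```python
def resolve_minkowski_p(distance):
    """Map a user-facing `distance` value to a Minkowski exponent p.

    Strings name a metric; numbers are ASSUMED to be the exponent itself.
    """
    named = {"manhattan": 1.0, "euclidean": 2.0}
    if isinstance(distance, str):
        try:
            return named[distance]
        except KeyError:
            raise ValueError("unknown distance name: %r" % distance)
    p = float(distance)
    if p < 1:
        # For p < 1 the Minkowski formula no longer satisfies the
        # triangle inequality, so it is rejected in this sketch.
        raise ValueError("Minkowski exponent must be >= 1, got %r" % p)
    return p
```

With this shape, internal code can keep working with a single number while the public API avoids exposing a bare "p".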

Re: [Scikit-learn-general] Other distance metrics for kNN

2012-01-05 Thread Gael Varoquaux
On Thu, Jan 05, 2012 at 09:33:28AM -0800, Jacob VanderPlas wrote: > In computing the mahalanobis distance, a temporary storage array is > needed. To avoid repeated allocation within the distance C-function > (and to avoid the need for malloc/free), I pre-allocate this temporary > array via nump
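The pre-allocation idea quoted above can be sketched in plain Python: the caller owns one scratch buffer and the distance function overwrites it on every call instead of allocating. This is not Jake's Cython code. A plain list stands in for the NumPy array, the name `work_buffer` follows the renaming suggested elsewhere in the thread, and `VI` is assumed to be the inverse covariance matrix, as in `scipy.spatial.distance.mahalanobis`.

```python
import math


def mahalanobis(x, y, VI, work_buffer):
    """Mahalanobis distance sqrt((x - y)^T VI (x - y)).

    VI is a square matrix given as a list of rows (assumed to be the
    inverse covariance). work_buffer is a caller-owned list of length
    len(x) that is overwritten with x - y on each call.
    """
    n = len(x)
    # Store the difference vector in the reusable buffer.
    for i in range(n):
        work_buffer[i] = x[i] - y[i]
    total = 0.0
    for i in range(n):
        row = VI[i]
        acc = 0.0
        for j in range(n):
            acc += row[j] * work_buffer[j]
        total += work_buffer[i] * acc
    return math.sqrt(total)
```

A caller would allocate `work_buffer = [0.0] * n_features` once and pass it to every distance evaluation, which is the role the temporary array plays in the C-level function.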

Re: [Scikit-learn-general] Other distance metrics for kNN

2012-01-05 Thread Jacob VanderPlas
Gael Varoquaux wrote: > You are cimporting malloc and free. I personally have a difficult > relationship with those two old friends. However, they seem not to be used > in the code. I just wanted to check. > I initially used malloc and free, but settled on the `tmp` pointer to avoid this (see be

Re: [Scikit-learn-general] Other distance metrics for kNN

2012-01-05 Thread Jacob VanderPlas
Mathias, I'm glad you're excited to work on this! I think starting with just the minkowski p-distance in this case is a good idea, and it would be a great way for you to gain experience with the code. I'd do the following: - in sklearn/neighbors/base.py, add a parameter `p` to the NeighborsBas
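As a rough illustration of what a `p` parameter means for the brute-force method, here is a standalone sketch of a Minkowski p-distance neighbor query. It is not the NeighborsBase code from sklearn/neighbors/base.py; the function name `brute_force_kneighbors` and its signature are hypothetical.

```python
import heapq


def brute_force_kneighbors(X, query, n_neighbors=1, p=2):
    """Return (distance, index) pairs for the n_neighbors rows of X
    closest to query under the Minkowski p-distance."""
    def minkowski(a, b):
        return sum(abs(ai - bi) ** p for ai, bi in zip(a, b)) ** (1.0 / p)

    # Compute every distance lazily and keep only the smallest ones.
    distances = ((minkowski(row, query), i) for i, row in enumerate(X))
    return heapq.nsmallest(n_neighbors, distances)
```

Setting `p=1` gives the Manhattan distance and `p=2` the Euclidean distance, so the existing behaviour would remain the default.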

Re: [Scikit-learn-general] Other distance metrics for kNN

2012-01-05 Thread Gael Varoquaux
On Thu, Jan 05, 2012 at 08:33:01AM -0800, Jacob VanderPlas wrote: > Here's a small example I coded up that shows how I envision including > multiple distance metrics in BallTree > https://gist.github.com/1565998 > The idea is that you create functions to compute distance which expose C > functi

Re: [Scikit-learn-general] Other distance metrics for kNN

2012-01-05 Thread Mathias Verbeke
Hi, First, thanks for all the answers! Waauw, really interesting discussion. I have only basic Python skills, and never programmed in Cython (together with a lot of time constraints, as most of you probably), but I would like to give it a try to add new distance metrics to the brute force method.

Re: [Scikit-learn-general] Other distance metrics for kNN

2012-01-05 Thread Jacob VanderPlas
Here's a small example I coded up that shows how I envision including multiple distance metrics in BallTree https://gist.github.com/1565998 The idea is that you create functions to compute distance which expose C function pointers, so that the ball tree cython code can call these without pytho

Re: [Scikit-learn-general] sklearn.test() weirdness

2012-01-05 Thread Andreas
On 01/05/2012 07:37 AM, Gael Varoquaux wrote: > On Wed, Jan 04, 2012 at 03:38:41PM -0800, Jacob VanderPlas wrote: > >> Importing sklearn from within the scikit-learn source directory produces >> no such error. Perhaps this would be a good fix >> > +1 > > +10

Re: [Scikit-learn-general] KMeans implementation in C with OpenMP

2012-01-05 Thread Benjamin Hepp
My implementation is assuming all data fits in memory. I'll do some benchmarks and look into the openmp/building/cython issues. Benni On 12/23/11 10:46 PM, Gael Varoquaux wrote: > On Fri, Dec 23, 2011 at 10:42:45PM +0100, Gael wrote: >> The reason that we have integrated openmp code in the scikit