Re: [Numpy-discussion] 1.2.0rc2 tagged! --PLEASE TEST--
On Friday 03 October 2008, Jarrod Millman wrote:
> On Thu, Oct 2, 2008 at 4:29 PM, Chris Barker [EMAIL PROTECTED] wrote:
>> Robert Kern wrote:
>>> Superseded by the 1.2.0 release. See the thread "ANN: NumPy 1.2.0".
>>
>> I thought I'd seen that, but when I went to:
>>
>> http://www.scipy.org/Download
>>
>> I still got 1.1.
>
> I updated the page to point to the sourceforge page. Thanks for catching that.

It would be nice if you could update the PyPI package index too. Perhaps keeping a list of places where each NumPy release should be announced would be handy.

Thanks!

--
Francesc Alted
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Problem building the Numpy-Swig tests
Hello,

I have just installed Numpy and I am interested in using swig. When I try to build the tests I get this error:

    swig -c++ -python Array.i
    :9: Error: Macro '%typecheck' expects 1 argument
    :36: Error: Macro '%typecheck' expects 1 argument
    :64: Error: Macro '%typecheck' expects 1 argument
    :92: Error: Macro '%typecheck' expects 1 argument
    :119: Error: Macro '%typecheck' expects 1 argument
    :148: Error: Macro '%typecheck' expects 1 argument
    :177: Error: Macro '%typecheck' expects 1 argument
    :206: Error: Macro '%typecheck' expects 1 argument
    :235: Error: Macro '%typecheck' expects 1 argument
    ...

It seems to come from the %numpy_typemaps directives at the end of the numpy.i file:

    /* Concrete instances of the %numpy_typemaps() macro: Each invocation
     * below applies all of the typemaps above to the specified data type.
     */
    %numpy_typemaps(signed char       , NPY_BYTE     , int)
    %numpy_typemaps(unsigned char     , NPY_UBYTE    , int)*/
    %numpy_typemaps(short             , NPY_SHORT    , int)
    /*%numpy_typemaps(unsigned short    , NPY_USHORT   , int)
    %numpy_typemaps(int               , NPY_INT      , int)
    %numpy_typemaps(unsigned int      , NPY_UINT     , int)
    %numpy_typemaps(long              , NPY_LONG     , int)
    %numpy_typemaps(unsigned long     , NPY_ULONG    , int)
    %numpy_typemaps(long long         , NPY_LONGLONG , int)
    %numpy_typemaps(unsigned long long, NPY_ULONGLONG, int)
    %numpy_typemaps(float             , NPY_FLOAT    , int)
    %numpy_typemaps(double            , NPY_DOUBLE   , int)

Has anyone run into this problem? Thank you for any help.

Best regards,
Michel
[Numpy-discussion] RE: Problem building the Numpy-Swig tests
Oh sorry, I wrote my first email in French.

Hello,

I just installed Numpy. I am interested in using Swig. When I try to build the tests I get the following error message:

    swig -c++ -python Array.i
    :9: Error: Macro '%typecheck' expects 1 argument
    :36: Error: Macro '%typecheck' expects 1 argument
    :64: Error: Macro '%typecheck' expects 1 argument
    :92: Error: Macro '%typecheck' expects 1 argument
    :119: Error: Macro '%typecheck' expects 1 argument
    :148: Error: Macro '%typecheck' expects 1 argument
    :177: Error: Macro '%typecheck' expects 1 argument
    :206: Error: Macro '%typecheck' expects 1 argument
    :235: Error: Macro '%typecheck' expects 1 argument
    ...

It seems that the %numpy_typemaps directive is responsible for this error:

    /* Concrete instances of the %numpy_typemaps() macro: Each invocation
     * below applies all of the typemaps above to the specified data type.
     */
    %numpy_typemaps(signed char       , NPY_BYTE     , int)
    %numpy_typemaps(unsigned char     , NPY_UBYTE    , int)*/
    %numpy_typemaps(short             , NPY_SHORT    , int)
    /*%numpy_typemaps(unsigned short    , NPY_USHORT   , int)
    %numpy_typemaps(int               , NPY_INT      , int)
    %numpy_typemaps(unsigned int      , NPY_UINT     , int)
    %numpy_typemaps(long              , NPY_LONG     , int)
    %numpy_typemaps(unsigned long     , NPY_ULONG    , int)
    %numpy_typemaps(long long         , NPY_LONGLONG , int)
    %numpy_typemaps(unsigned long long, NPY_ULONGLONG, int)
    %numpy_typemaps(float             , NPY_FLOAT    , int)
    %numpy_typemaps(double            , NPY_DOUBLE   , int)

Has anybody already faced this problem? Thank you very much for any help.

Best regards,
Michel
[Numpy-discussion] choose() broadcasting, and Trac
Hello,

I have found something I call a bug in the numpy choose() method and wanted to report it in trac. http://scipy.org/BugReport states that the SciPy and NumPy Developer Pages use the same login/password. However, I (username smoerz) can log in with my Scipy account at the Scipy Developer Page (http://projects.scipy.org/scipy/scipy/), but not at the Numpy Developer Page (http://projects.scipy.org/scipy/numpy/).

Anyway, while porting some code from numarray to numpy, I found a regression in the broadcasting of choose():

    import numarray, numpy
    numarray.choose([[0,0,1], [0,0,1]], ([2,2,2], [3,3,3]))
    array([[2, 2, 3],
           [2, 2, 3]])
    numarray.choose([0,0,1], ([[2,2,2],[2,2,2]], [[3,3,3],[3,3,3]]))
    array([[2, 2, 3],
           [2, 2, 3]])
    numarray.choose([0,0,1], ([2,2,2], [[3,3,3],[3,3,3]]))
    array([[2, 2, 3],
           [2, 2, 3]])

Of these 3 cases, only the first one works with numpy; for the other two I get:

    /usr/lib/python2.5/site-packages/numpy/core/fromnumeric.pyc in choose(a, choices, out, mode)
        167         choose = a.choose
        168     except AttributeError:
    --> 169         return _wrapit(a, 'choose', choices, out=out, mode=mode)
        170     return choose(choices, out=out, mode=mode)
        171

    /usr/lib/python2.5/site-packages/numpy/core/fromnumeric.pyc in _wrapit(obj, method, *args, **kwds)
         35     except AttributeError:
         36         wrap = None
    ---> 37     result = getattr(asarray(obj),method)(*args, **kwds)
         38     if wrap:
         39         if not isinstance(result, mu.ndarray):

    ValueError: too many dimensions

I consider this a bad regression from numarray to numpy, because the failing broadcast examples seem to be more important than the working one.

Best Regards,

Roman
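For readers who want the numarray-style result regardless of how choose() behaves in their NumPy version, the broadcasting Roman describes can be spelled out by hand. The following is only a sketch under stated assumptions: choose_broadcast is a hypothetical helper name (it is not part of numarray or numpy), and it relies on np.broadcast_arrays and np.take_along_axis, the latter of which exists only in NumPy releases much newer than the 1.2 discussed in this thread.

```python
import numpy as np

def choose_broadcast(selector, choices):
    """Hypothetical helper: pick choices[selector] elementwise after
    broadcasting the selector and every choice to one common shape."""
    arrays = np.broadcast_arrays(np.asarray(selector),
                                 *[np.asarray(c) for c in choices])
    sel = arrays[0]
    # Stack the broadcast choices along a new leading axis: (n_choices, ...)
    stacked = np.stack(arrays[1:])
    # Index the leading axis with the selector, then drop that axis again.
    return np.take_along_axis(stacked, sel[np.newaxis, ...], axis=0)[0]

# The third case from the report: 1-D selector, mixed 1-D and 2-D choices.
result = choose_broadcast([0, 0, 1], ([2, 2, 2], [[3, 3, 3], [3, 3, 3]]))
print(result)
```

Because every input is broadcast explicitly before indexing, all three cases in the report go through the same code path.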
Re: [Numpy-discussion] 1.2.0rc2 tagged! --PLEASE TEST--
On Fri, Oct 3, 2008 at 12:35 AM, Francesc Alted [EMAIL PROTECTED] wrote:
> It would be nice if you can update the PYPI package index too. Perhaps
> having a list of places on where to announce NumPy on every release
> would be handy.

Done.

Thanks,

--
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/
Re: [Numpy-discussion] Help to process a large data file
Frank,

On Thu, Oct 2, 2008 at 3:20 PM, frank wang [EMAIL PROTECTED] wrote:
> Thanks David and Chris for providing the nice solution.

Glad it helped.

> Both methods work great. I could not tell the speed difference between
> the two solutions. My data size is 1048577 lines.

I'd be curious to know what happens for larger files (~10 M lines). I'd guess Chris's solution would be the fastest, since it works incrementally and does not load the entire data into memory. If you ever try, I'll be interested to know how it turns out.

David

> I did not try the second solution from Chris since it is too slow, as
> Chris stated.
>
> Frank
>
>> Date: Thu, 2 Oct 2008 17:43:37 +0200
>> From: [EMAIL PROTECTED]
>> To: numpy-discussion@scipy.org
>> CC: [EMAIL PROTECTED]
>> Subject: Re: [Numpy-discussion] Help to process a large data file
>>
>> Frank,
>>
>> I would imagine that you cannot get much better performance in Python
>> than this, which avoids string conversions:
>>
>>     c = []
>>     count = 0
>>     for line in open('foo'):
>>         if line == '1 1\n':
>>             c.append(count)
>>             count = 0
>>         else:
>>             if '1' in line:
>>                 count += 1
>>
>> One could do some numpy trick like:
>>
>>     import numpy as np
>>     a = np.loadtxt('foo', dtype=int)
>>     a = np.sum(a, axis=1)    # Add the two columns horizontally
>>     b = np.where(a == 2)[0]  # Find rows with sum == 2 (1 + 1)
>>     count = []
>>     for i, j in zip(b[:-1], b[1:]):
>>         count.append(a[i+1:j].sum())  # Calculate number of lines with 1
>>
>> but on my machine the numpy version takes about 20 sec for a 'foo'
>> file of 2,500,000 lines versus 1.2 sec for the pure python version...
>>
>> As a side note, if I replace line == '1 1\n' with
>> line.startswith('1 1'), the pure python version goes up to 1.8 sec...
>> Isn't this a bit weird? I'd think startswith() should be faster...
>>
>> Chris
>>
>> On Wed, Oct 01, 2008 at 07:27:27PM -0600, frank wang wrote:
>>> Hi,
>>>
>>> I have a large data file which contains 2 columns of data. The two
>>> columns only contain zeros and ones. Now I want to count how many
>>> ones there are in between the rows where both columns are one. For
>>> example, if my data is:
>>>
>>>     1 0
>>>     0 0
>>>     1 1
>>>     0 0
>>>     0 1 x
>>>     0 1 x
>>>     0 0
>>>     0 1 x
>>>     1 1
>>>     0 0
>>>     0 1 x
>>>     0 1 x
>>>     1 1
>>>
>>> Then my count will be 3 and 2 (the rows marked with x). Is there an
>>> efficient way to do this? My data file is pretty big.
>>>
>>> Thanks
>>> Frank
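As a self-contained illustration of the incremental approach discussed in this thread, here is a minimal sketch that works on any iterable of lines, so it can be fed an open file without loading it into memory. The function name count_ones_between_markers is hypothetical, and one assumption differs from the loop quoted above: rows before the first "1 1" marker are ignored, which is what Frank's expected output of 3 and 2 suggests.

```python
def count_ones_between_markers(lines):
    """Count rows containing a single 1 between consecutive '1 1' rows.

    Assumption: rows that appear before the first '1 1' marker are ignored.
    """
    counts = []
    count = None  # None until the first '1 1' marker has been seen
    for line in lines:
        fields = line.split()
        if fields == ['1', '1']:
            if count is not None:
                counts.append(count)
            count = 0
        elif count is not None and '1' in fields:
            count += 1
    return counts

# The sample data from Frank's original question, one row per string.
sample = ["1 0", "0 0", "1 1", "0 0", "0 1", "0 1", "0 0",
          "0 1", "1 1", "0 0", "0 1", "0 1", "1 1"]
print(count_ones_between_markers(sample))
```

For a real file, passing the open file object (for example `open('foo')`) in place of `sample` keeps the processing line by line.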
Re: [Numpy-discussion] choose() broadcasting, and Trac
Roman Bertle wrote:
> Hello,
>
> I have found something I call a bug in the numpy choose() method and
> wanted to report it in trac.

Thanks for your report. I'm not sure why you are having trouble with Trac, but I've created a ticket for this problem.

-Travis
[Numpy-discussion] RE: RE: Problem building the Numpy-Swig tests
I was using swig 1.3.24. I installed the latest swig version, 1.3.36, and now it is working fine! And it makes me very, very happy!!!
[Numpy-discussion] Array shape
I'm using Numpy to do some basic array manipulation, and I'm getting some unexpected behavior from shape. Specifically, I have some 3x3 and 2x2 matrices, and shape gives me (5, 3) and (3, 2) for their respective sizes. I was expecting (3, 3) and (2, 2), for number of rows, number of columns. I'm assuming I must either be misunderstanding what shape gives you or doing something wrong. Can anybody give me any advice?

I'm using Python 2.5 and Numpy 1.1.0.

Thanks,
Kelly
Re: [Numpy-discussion] Array shape
Kelly Vincent wrote:
> I'm using Numpy to do some basic array manipulation, and I'm getting
> some unexpected behavior from shape. Specifically, I have some 3x3 and
> 2x2 matrices, and shape gives me (5, 3) and (3, 2) for their respective
> sizes. I was expecting (3, 3) and (2, 2), for number of rows, number of
> columns. I'm assuming I must either be misunderstanding what shape
> gives you or doing something wrong. Can anybody give me any advice?
>
> I'm using Python 2.5 and Numpy 1.1.0.

Can you post a complete, minimal example that shows the problem you have? For an array object A, A.shape should give the shape you're expecting.

Ryan

--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
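A complete, minimal example of the kind Ryan asks for might look like the sketch below. The array contents are made up for illustration and are not Kelly's data; the point is only that shape reports (rows, columns) for a 2-D array, so a 3x3 and a 2x2 array are expected to give (3, 3) and (2, 2).

```python
import numpy as np

# Illustrative data only; any 3x3 and 2x2 nested lists would do.
a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
b = np.array([[1, 2],
              [3, 4]])

print(a.shape)  # (rows, columns) of the 3x3 array
print(b.shape)  # (rows, columns) of the 2x2 array
```

If a script reports something else, printing the array itself right before the shape is usually the quickest way to see what was actually built.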
Re: [Numpy-discussion] Problem building the Numpy-Swig tests
On Fri, Oct 3, 2008 at 11:21 AM, Michel Dupront [EMAIL PROTECTED] wrote:
> I was using swig 1.3.24. I installed the latest swig version, 1.3.36,
> and now it is working fine! And it makes me very, very happy!!!

SWIG often has that effect on people :)

--
Nathan Bell [EMAIL PROTECTED]
http://graphics.cs.uiuc.edu/~wnbell/
Satisfied SWIG customer
[Numpy-discussion] making numpy.dot faster
I am doing a calculation where one call to numpy.dot ends up taking 90% of the time (the array is huge: (61373, 500)). Any chance I can make this faster? I would have thought BLAS/ATLAS would be behind this, but from my quick analysis (ldd on numpy/core/multiarray.so) it doesn't seem so. Have I done something stupid when building numpy? (Disclaimer: I am on a system I don't know well, Mandriva, so I could very well have done something stupid.)

Cheers,

Gaël
Re: [Numpy-discussion] making numpy.dot faster
On Fri, Oct 3, 2008 at 10:59 AM, Gael Varoquaux [EMAIL PROTECTED] wrote:
> I am doing a calculation where one call to numpy.dot ends up taking 90%
> of the time (the array is huge: (61373, 500)). Any chance I can make
> this faster? I would have thought BLAS/ATLAS would be behind this, but
> from my quick analysis (ldd on numpy/core/multiarray.so) it doesn't
> seem so. Have I done something stupid when building numpy? (Disclaimer:
> I am on a system I don't know well, Mandriva, so I could very well have
> done something stupid.)

What does np.__config__.show() show? What exactly are you multiplying? What is the original problem?

Chuck
[Numpy-discussion] Co-existing Numeric and numpy?
Hi,

We have some older 3rd party packages that require Numeric (24.2), but would also like to use newer 3rd party packages that require a recent numpy. Can Numeric and numpy co-exist in the same process? I'm mainly worried about clashes at the C API level.

Thanks!
Ralf
Re: [Numpy-discussion] Co-existing Numeric and numpy?
Ralf W. Grosse-Kunstleve wrote:
> We have some older 3rd party packages that require Numeric (24.2), but
> would also like to use newer 3rd party packages that require a recent
> numpy. Can Numeric and numpy co-exist in the same process? I'm mainly
> worried about clashes at the C API level.

Nope, there are no problems. In fact, converting between the two array types with asarray() works very efficiently.

I sure wish everyone would make the transition, though.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR             (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA 98115        (206) 526-6317   main reception

[EMAIL PROTECTED]
Re: [Numpy-discussion] Co-existing Numeric and numpy?
Thank you very much for the information! There are weird but important reasons why we are stuck with Numeric 24.2 for the foreseeable future. It is extremely valuable that you provide a smooth upgrade path.

Ralf
Re: [Numpy-discussion] making numpy.dot faster
Fri, 03 Oct 2008 18:59:02 +0200, Gael Varoquaux wrote:
> I am doing a calculation where one call to numpy.dot ends up taking 90%
> of the time (the array is huge: (61373, 500)). Any chance I can make
> this faster? I would have thought BLAS/ATLAS would be behind this, but
> from my quick analysis (ldd on numpy/core/multiarray.so) it doesn't
> seem so. Have I done something stupid when building numpy? (Disclaimer:
> I am on a system I don't know well, Mandriva, so I could very well have
> done something stupid.)

AFAIK, multiarray.so is never linked against ATLAS. The accelerated dot implementation is in _dotblas.so, and can be toggled with alterdot/restoredot (but the ATLAS one should be active by default).

    numpy.dot.__module__
    'numpy.core._dotblas'

Are your arrays appropriately contiguous? Numpy needs to copy the data if they are not, though I'm not sure if this could account for what you see.

--
Pauli Virtanen
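The two checks suggested in this thread (build configuration and contiguity) can be sketched as follows. This is an illustration under assumptions, not a diagnosis of Gaël's build: the small arrays are made up, and the calls used (np.show_config, the flags attribute, np.ascontiguousarray) are the ones available in current NumPy, which may report things differently from the 2008 release discussed here.

```python
import numpy as np

# Print the BLAS/LAPACK information NumPy was built with.
np.show_config()

a = np.arange(12.0).reshape(3, 4)  # freshly created, C-contiguous
b = a.T                            # transposed view, not C-contiguous

print(a.flags['C_CONTIGUOUS'], b.flags['C_CONTIGUOUS'])

# Make an explicit contiguous copy once, instead of leaving any copying
# to happen implicitly inside each dot() call.
b_c = np.ascontiguousarray(b)
print(b_c.flags['C_CONTIGUOUS'])

# Both operands give the same product; only the memory layout differs.
product = np.dot(a, b_c)
```

Whether the contiguous copy actually helps depends on the BLAS in use and on how often the same operand is reused, so it is worth timing both variants on the real data.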
Re: [Numpy-discussion] Proposal: scipy.spatial
I remember reading a paper or book that stated that for data that has been normalized, correlation and Euclidean distance are equivalent and will produce the same knn results. To this end I spent a couple of hours this afternoon doing the math. This document is the result.

http://www.cs.colostate.edu/~bolme/bolme08euclidean.pdf

I believe that with mean-subtracted, unit-length vectors, a Euclidean knn algorithm will produce the same result as if the vectors were compared using correlation. I am not sure if kd-trees will perform well on the normalized vectors, as they have a very specific geometry. If my math checks out it may be worth adding Pearson's correlation as a default option or as a separate class.

I have also spent a little time looking at kd-trees and the kdtree code. It looks good. As I understand it, kd-trees only work well when the number of datapoints (N) is larger than 2^D, where D is the dimensionality of those points. This will not work well for many of my computer vision problems because often D is large. As Anne suggested, I will probably look at cover trees, because oftentimes the data are low-dimensional data in high-dimensional spaces. I have been told, though, that with a large D there is no known fast algorithm for knn.

Another problem is that the distances and similarity measures used in biometrics and computer vision are often very specialized and may or may not conform to the underlying assumptions of fast algorithms. I think for this reason I will need an exhaustive search algorithm. I will code it up modeled after Anne's interface and hopefully it will make it into the spatial module.

I think that kd-trees and the spatial module are a good contribution to scipy. I have also enjoyed learning more about norms, correlation, and fast knn algorithms.

Thanks,
Dave
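The equivalence described above follows from a short identity: for mean-subtracted, unit-length vectors x and y, the squared Euclidean distance is ||x - y||^2 = 2 - 2 (x . y) = 2 (1 - r), where r is Pearson's correlation of the original vectors, so ranking neighbours by distance is the same as ranking them by correlation. The sketch below checks this numerically; the normalize helper and the random test vectors are illustrative assumptions, not taken from the linked document.

```python
import numpy as np

def normalize(v):
    """Subtract the mean and scale to unit length (hypothetical helper)."""
    v = np.asarray(v, dtype=float)
    v = v - v.mean()
    return v / np.linalg.norm(v)

rng = np.random.default_rng(0)  # fixed seed so the check is repeatable
x = rng.normal(size=50)
y = rng.normal(size=50)

r = np.corrcoef(x, y)[0, 1]                           # Pearson's correlation
d2 = np.sum((normalize(x) - normalize(y)) ** 2)       # squared Euclidean distance

# The two quantities should agree up to floating-point rounding.
print(d2, 2.0 * (1.0 - r))
```

Since d2 is a decreasing function of r, the nearest neighbour in Euclidean distance is the most highly correlated one.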
Re: [Numpy-discussion] Proposal: scipy.spatial
2008/10/3 David Bolme [EMAIL PROTECTED]:
> I remember reading a paper or book that stated that for data that has
> been normalized, correlation and Euclidean distance are equivalent and
> will produce the same knn results. To this end I spent a couple of
> hours this afternoon doing the math. This document is the result.
>
> http://www.cs.colostate.edu/~bolme/bolme08euclidean.pdf

Yes, you're right, even without the mean subtraction they all lie on a hypersphere in Euclidean space. It's a little more awkward if you want to identify x and -x.

> I believe that with mean-subtracted, unit-length vectors, a Euclidean
> knn algorithm will produce the same result as if the vectors were
> compared using correlation. I am not sure if kd-trees will perform well
> on the normalized vectors, as they have a very specific geometry. If my
> math checks out it may be worth adding Pearson's correlation as a
> default option or as a separate class.

Actually it's probably easier if the user just does the prenormalization.

> I have also spent a little time looking at kd-trees and the kdtree
> code. It looks good. As I understand it, kd-trees only work well when
> the number of datapoints (N) is larger than 2^D, where D is the
> dimensionality of those points. This will not work well for many of my
> computer vision problems because often D is large. As Anne suggested, I
> will probably look at cover trees, because oftentimes the data are
> low-dimensional data in high-dimensional spaces. I have been told,
> though, that with a large D there is no known fast algorithm for knn.

Pretty much true. Though if the intrinsic dimensionality is low, cover trees should be all right.

> Another problem is that the distances and similarity measures used in
> biometrics and computer vision are often very specialized and may or
> may not conform to the underlying assumptions of fast algorithms. I
> think for this reason I will need an exhaustive search algorithm. I
> will code it up modeled after Anne's interface and hopefully it will
> make it into the spatial module.

Metric spaces are quite general: edit distance for strings, for example, is an adequate distance measure. But brute-force is definitely worth having too.

If I get the test suite cleaned up, it should be possible to just drop an arbitrary k-nearest-neighbors class into it and get a reasonably thorough collection of unit tests.

Anne
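An exhaustive search of the kind discussed above is short enough to sketch. This is a minimal illustration under assumptions, not the code that was proposed for the spatial module: brute_force_knn and its distance parameter are hypothetical names, and the (distances, indices) return order is chosen only to resemble the query() convention of scipy.spatial's kd-tree classes.

```python
import numpy as np

def brute_force_knn(data, query, k, distance=None):
    """Return (distances, indices) of the k rows of data nearest to query.

    distance is an optional callable taking two 1-D arrays; when omitted,
    Euclidean distance is used. Every row is examined, so the cost is
    O(N) distance evaluations per query.
    """
    data = np.asarray(data, dtype=float)
    query = np.asarray(query, dtype=float)
    if distance is None:
        dists = np.sqrt(((data - query) ** 2).sum(axis=1))
    else:
        dists = np.array([distance(row, query) for row in data])
    order = np.argsort(dists, kind="stable")[:k]
    return dists[order], order

points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
dists, idx = brute_force_knn(points, [0.1, 0.0], k=2)
print(dists, idx)

# A specialized measure can be dropped in, here the city-block distance.
man_d, man_idx = brute_force_knn(points, [0.1, 0.0], k=1,
                                 distance=lambda u, v: np.abs(u - v).sum())
```

Because the distance is just a callable, the specialized measures mentioned for biometrics and computer vision need no metric-space assumptions here, at the price of examining every point.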