Re: [Numpy-discussion] Numpy-discussion Digest, Vol 19, Issue 24
Quoting [EMAIL PROTECTED]:
> What will be the licensing of this project? Do you know yet?

I am thinking GPL for the compiler and LGPL for any runtime components, which would be similar to GCC. Which version (2 or 3) of the license is undecided. I will also check with the university to see if they have any problems (they shouldn't). I also need to check with the university about hosting; I believe I will need to host on university servers.

> I have a couple of comments because I've been thinking along these lines.

What is Spyke? In many performance-critical projects, it is often necessary to rewrite parts of the application in C. However, writing C wrappers can be time consuming. Spyke offers an alternative approach: you add annotations to your Python code as strings. These strings are discarded by the Python interpreter, but they are interpreted as types by the Spyke compiler when converting to C. Example:

    "int -> int"
    def f(x):
        return 2*x

In this case the Spyke compiler will treat the string "int -> int" as a declaration that the function accepts an int as parameter and returns an int. Spyke will then generate a C function and a wrapper function.

> What about the use of decorators in this case?

I can certainly use decorators. Will implement this change soon.

> Also, it would be great to be able to create ufuncs (and general numpy funcs) using this approach. A decorator would work well here as well.

Where is Spyke? Spyke will be available as a binary-only release in a couple of weeks. I intend to make it open source after a few months.

> I'd like to encourage you to make it available as open source as early as possible. I think you are likely to get help in ways you didn't expect. People are used to reading code, so even an alpha project can get help early. In fact, given that you are looking for help, I think this may be the best way to get it.

Ok, I will release the source along with the binary. I need to sort some stuff out, so it might take a couple of weeks.
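The decorator idea discussed above could look something like the following sketch. This is purely hypothetical — Spyke's decorator API did not exist yet, and the name `spyke_type` is invented for illustration; in plain Python the decorator simply records the signature string where a compiler pass could later find it:

```python
def spyke_type(signature):
    """Hypothetical decorator recording a C type signature.

    Pure-Python stand-in: it only attaches the signature string to
    the function object; a compiler pass could later pick it up.
    """
    def decorate(func):
        func.spyke_signature = signature
        return func
    return decorate

@spyke_type("int -> int")
def f(x):
    return 2 * x
```

Under the interpreter the function behaves as ordinary Python (`f(3)` returns `6`), while the signature stays available as `f.spyke_signature`.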
Note that much of the compiler is (for better or worse) written in Java. The codebase isn't very OOP (full of static methods; it looks more like garbage-collected C), but it is not too complex either. I use CPython's compiler module to dump the AST into an intermediate file, which is then parsed by the compiler in Java. The compiler uses an AST representation throughout, and it also depends upon the ANTLR Java runtime. For hosting, I will probably get some space on the university servers, and I will try to get Trac installed.

I will release when all of the following work:

a) Basic support for functions and classes.
b) Keyword parameters not supported.
c) Special methods not supported, except __init__.
d) __init__ is treated as the constructor. Custom __new__ not supported.
e) Nested functions may be broken.
f) Functions will be divided into two types: static and dynamic. Static functions must not be redefined at runtime, while dynamic functions can be redefined at runtime but are more costly to call, since the binding must be looked up each time. Even though a dynamic function can be redefined, its type signature must not change. If a static function calls another static function, the compiler will try to insert a call to the C function instead of the wrapped function, thus bypassing the interpreter where possible.
g) Compiled classes must not redefine methods at runtime. There will be an option to annotate a class as final, meaning the user shouldn't subclass it; for such classes it is easier to generate efficient code for attribute access. Compiled classes also must not dynamically add or delete attributes.
h) Users shouldn't subclass the numpy array.
i) For method calls on objects, the generated code will mostly just end up calling into the interpreter, so performance in this case will not be particularly good currently. For ints, floats, etc., the equivalent C code is generated, so for these types the code should be fast enough.
j) For indexing of numpy arrays, unsafe code is generated: the array is accessed directly without any index checking.
k) Loops: this is the weakest point currently. Only for-loops over range() or xrange() are allowed, permitting easy conversion to C. You cannot loop over the elements of other lists, numpy arrays, etc.
l) exec, eval, metaclasses, dynamic class creation, dynamic adding/deleting of attributes, etc. are not allowed inside typed code.
m) A module cannot currently mix typed and untyped code: a module must be completely typed/annotated, or it is left alone and not compiled. A typed module also cannot contain arbitrary executable code; it should consist only of single-statement variable declarations and function and class definitions. Of course, the rest of your application can be left untyped. In the future I will try to allow mixing typed and untyped code in a module.
n) Importing of other typed modules is also mostly supported.
o) Builtin functions: range and len mostly work. But
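To illustrate the restrictions in points j) and k), here is the kind of function a Spyke user might write. The annotation syntax shown in the string is hypothetical; under plain CPython the string is simply discarded and the function runs unchanged:

```python
import numpy as np

"array -> float"  # hypothetical Spyke-style annotation; ignored by CPython
def total(a):
    s = 0.0
    # Only for-loops over range()/xrange() are convertible to C (point k);
    # a[i] would become unchecked C array access in compiled code (point j).
    for i in range(len(a)):
        s += a[i]
    return s
```

Because the compiled indexing is unchecked, an out-of-bounds index that raises IndexError under the interpreter would be silent memory corruption in the generated C.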
Re: [Numpy-discussion] Numpy-discussion Digest, Vol 19, Issue 24
On Mon, Apr 7, 2008 at 12:19 AM, Rahul Garg [EMAIL PROTECTED] wrote:
> I am thinking GPL for the compiler and LGPL for any runtime components, which would be similar to GCC. Which version (2 or 3) of the license is undecided. [...]

Scipy and Numpy are BSD and don't include GPL components, so you might want to consider a more liberal license.

Chuck

___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Spyke python-to-C compiler Was: Numpy-discussion Digest, Vol 19, Issue 24
On Monday 07 April 2008, Charles R Harris wrote:
> Scipy and Numpy are BSD and don't include GPL components, so you might want to consider a more liberal license.

As I see it, and given that parts of Spyke are written in Java, it is very unlikely that it will ever be included in NumPy/SciPy. It looks like a compiler, so having the same licensing as GCC shouldn't bother the NumPy community, IMO.

Cheers,
-- Francesc Altet http://www.carabos.com/ Cárabos Coop. V. Enjoy Data
Re: [Numpy-discussion] New project : Spyke python-to-C compiler
Hi Rahul,

Nice project. I think you are taking the right direction with type annotations. If you get this working and reliable, you will be much loved by the community.

On Sun, Apr 06, 2008 at 11:19:58PM -0500, Travis E. Oliphant wrote:
> > c) Strings as type declarations: Do you think I should use decorators instead, at least for function type declarations?
> I think you should use decorators. That way you can work towards having the compiler embedded in the decorator, so compilation happens seamlessly without invoking a separate program (it just happens when the module is loaded, à la weave).

+1. This is a very promising route. You can then choose exactly what you want to compile and what you want to keep pure Python (something similar to Cython, without the intermediate file). I would even stick with some decoration after Python3K, say @compiled(), to keep this on-the-fly compilation that Travis is mentioning.

Cheers,
Gaël
Re: [Numpy-discussion] New project : Spyke python-to-C compiler
> (Though as the saying goes, little duplication is normal (and perhaps wanted) for open source software.)

Sorry! I meant "a little", which completely reverses the meaning of my sentence.

Dag Sverre
Re: [Numpy-discussion] PCA on set of face images
On Tuesday, 11 March 2008 00:24:04, David Bolme wrote:
> The steps you describe here are correct. I am putting together an open source computer vision library based on numpy/scipy. It will include an automatic PCA algorithm with face detection, eye detection, PCA dimensionality reduction, and distance measurement. If you are interested, let me know and I will redouble my efforts to release the code soon.

That's interesting; we're also working on a computer vision module using NumPy (actually, a sort of VIGRA-NumPy binding), and there's scipy.ndimage, too. Maybe (part of) your code could be integrated into the latter? I am looking forward to it anyway.

-- Ciao, Hans
Re: [Numpy-discussion] New project : Spyke python-to-C compiler
> What is Spyke? In many performance critical projects, it is often necessary to rewrite parts of the application in C. However writing C wrappers can be time consuming. Spyke offers an alternative approach. You add annotations to your Python code as strings. These strings are discarded by the Python interpreter but are interpreted as types by the Spyke compiler to convert to C.

Have you had a look at Cython? http://cython.org. From what you write, it looks like we have almost exactly the same long-term goals; one could almost say that the two pieces of software will be complete duplicates in functionality. (Cython isn't there just yet, though.) (Though as the saying goes, little duplication is normal (and perhaps wanted) for open source software.)

Dag Sverre
Re: [Numpy-discussion] New project : Spyke python-to-C compiler
On Sun, Apr 6, 2008 at 8:48 PM, Rahul Garg [EMAIL PROTECTED] wrote:
> ...function. This idea is directly copied from the PLW (Python Language Wrapper) project. Once Python3k arrives, many of these declarations will be moved to function annotations and class decorators.

Python 3k alpha 4 is available. Why not skip the string-based version and go directly to using the function annotation capability of 3.0? Your project could be a good test case for this new feature of the language.
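With Python 3's function annotations (PEP 3107), the type information proposed above moves from a string into the signature itself — a minimal sketch:

```python
# PEP 3107 function annotations: the interpreter stores them on the
# function object but does not enforce them, so a compiler like Spyke
# could read them without changing runtime behavior.
def f(x: int) -> int:
    return 2 * x

assert f.__annotations__ == {'x': int, 'return': int}
```

A Spyke-like tool would inspect `f.__annotations__` at module load time, exactly as it would have parsed the declaration strings.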
Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram
On Saturday, 5 April 2008 21:54:27, Anne Archibald wrote:
> There's also a fourth option - raise an exception if any points are outside the range.

+1. I think this should be the default. Otherwise, I tend towards "exclude", in order to have comparable bin sizes (when plotting, I always find peaks at the ends annoying); this could also be called "clip", BTW. But really, an exception would follow the Zen: "In the face of ambiguity, refuse the temptation to guess." And with a kwarg: "Explicit is better than implicit."

    histogram(a, arange(10), outliers='clip')
    histogram(a, arange(10), outliers='include')  # better names? include/accumulate, map-to-border, boundary?

-- Ciao, Hans
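The behaviors under discussion can be sketched in a few lines. This is a hypothetical wrapper, not numpy's actual API — the `outliers` keyword and the function name are invented for illustration:

```python
import numpy as np

def histogram_outliers(a, bins, outliers='raise'):
    """Sketch of the proposed keyword (hypothetical; not numpy's API).

    outliers='raise'   -> exception on any out-of-range value
    outliers='exclude' -> silently drop out-of-range values
    outliers='clip'    -> count them in the first/last bin
    """
    a = np.asarray(a, dtype=float).ravel()
    out = (a < bins[0]) | (a > bins[-1])
    if outliers == 'raise':
        if out.any():
            raise ValueError("values outside bin range")
    elif outliers == 'exclude':
        a = a[~out]
    elif outliers == 'clip':
        # map low outliers into the first bin, high ones into the last
        a = np.clip(a, bins[0], bins[-1])
    return np.histogram(a, bins=bins)
```

For `a = [-1, 0.5, 2.5, 12]` with `bins = arange(11)`, 'exclude' counts two values, 'clip' counts all four (the outliers landing in the edge bins), and the default raises.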
Re: [Numpy-discussion] Numpy-discussion Digest, Vol 19, Issue 24
On Mon, 07 Apr 2008, Rahul Garg apparently wrote:
> I am thinking GPL for the compiler and LGPL for any runtime components, which would be similar to GCC. Which version (2 or 3) of the license is undecided.

The author determines the license, of course. But don't forget to consider this one: http://www.python.org/psf/license/

Cheers,
Alan Isaac
Re: [Numpy-discussion] varu and stdu
Anne, Travis,

I have no problem getting rid of varu and stdu in MaskedArray: they were introduced for my own convenience, and they're indeed outdated with the introduction of the ddof parameter. I'll get rid of them next time I post something on the SVN.
Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram
+1 for an outliers keyword. Note that this implies that when bins are passed explicitly, the edges are given (nbins+1), not simply the left edges (nbins).

While we are refactoring histogram, I'd suggest adding an axis keyword. This is pretty straightforward to implement using the np.apply_along_axis function.

Also, I noticed that the current normalization is buggy for non-uniform bin sizes:

    if normed:
        db = bins[1] - bins[0]
        return 1.0/(a.size*db) * n, bins

Finally, whatever option is chosen in the end, we should make sure it is consistent across all histogram functions. This may mean that we will also break the behavior of histogramdd and histogram2d.

Bruce: I did some work over the weekend on the histogram function, including tests. If you want, I'll send that to you in the evening.

David

2008/4/7, Hans Meine [EMAIL PROTECTED]:
> +1. I think this should be the default. Otherwise, I tend towards "exclude", in order to have comparable bin sizes; this could also be called "clip", BTW. [...]
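The normalization bug David points out is visible directly: the snippet divides every count by the single width `bins[1] - bins[0]`, whereas a proper density needs each bin's own width. A sketch of the fix (the helper name is for illustration only):

```python
import numpy as np

def normed_counts(n, bins, total):
    """Convert bin counts to a density, correctly for non-uniform bins.

    Buggy original:  db = bins[1] - bins[0]   (assumes uniform widths)
    """
    db = np.diff(bins)        # per-bin widths handle non-uniform bins
    return n / (total * db)
```

With counts `[2, 2]` over edges `[0, 1, 3]` and 4 samples, this yields densities `[0.5, 0.25]`, which integrate to 1 over the bins — the single-width version would not.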
[Numpy-discussion] Any multigrid package to recommend ?
Dear guys@list:

I want to do some applications using the multigrid method for electrostatic problems. Is there a Python package available for my purpose?

Best,
-- Hai-Ping Lan, Department of Electronics, Peking University, Beijing, 100871 [EMAIL PROTECTED], [EMAIL PROTECTED]
Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram
Hi,

Thanks David for pointing out the piece of information I forgot to add in my original email.

-1 for "raise an exception" because, as Dan points out, the problem stems from the user providing the bins. +1 for the outliers keyword. Should "exclude" distinguish points that are too low from those that are too high? +1 for axis.

Really, I was only looking at what it would take to close this bug, but I am willing to test any code.

Thanks,
Bruce

On Mon, Apr 7, 2008 at 8:55 AM, David Huard [EMAIL PROTECTED] wrote:
> +1 for an outliers keyword. Note that this implies that when bins are passed explicitly, the edges are given (nbins+1), not simply the left edges (nbins). [...] Bruce: I did some work over the weekend on the histogram function, including tests. If you want, I'll send that to you in the evening.
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
2008/4/4, Joe Harrington [EMAIL PROTECTED]:
> import numpy as N
> import numpy.math as N.M
> import numpy.trig as N.T
> import numpy.stat as N.S

I don't think the issue is whether to put everything in the base namespace or everything in individual namespaces, but rather to find an optimal and intuitive mix of the two. For instance, the io functions would be easier to find by typing np.io.loadtxt than by sifting through the 500+ items of the base namespace. The stats functions could equally well be in a separate namespace, given that the most used ones are implemented as array methods. I think this would allow numpy to grow more gracefully.

As for the financial functions, being specific to a discipline, I think they rather belong with scipy. The numpy namespace will quickly become a mess if we add np.geology, np.biology, np.material, etc. Of course, this raises the problem of distributing scipy, and here is a suggestion: change the structure of scipy so that it looks like the scikits:

    scipy/
        sparse/
        cluster/
        financial/
        ...
        fftpack/
            setup.py
            scipy/
                __init__.py
                fftpack/

The advantage is that each subpackage can be installed independently of the others. For distribution, we could lump all the pure-Python or easy-to-compile packages into scipy.common, and distribute the other packages such as sparse and fftpack independently. My feeling is that such a lighter structure would encourage projects with large code bases to join the scipy community. It would also allow folks with 56k modems to download only what they need.

David
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On 07/04/2008, David Huard [EMAIL PROTECTED] wrote:
> I don't think the issue is whether to put everything in the base namespace or everything in individual namespaces, but rather to find an optimal and intuitive mix of the two. [...] I think this would allow numpy to grow more gracefully.

I agree, and I think we can come to some compromise -- maybe a numpy.all namespace that simply imports all the other subpackages.

Regards
Stéfan
Re: [Numpy-discussion] varu and stdu
I know I'm off topic and maybe a day late, but I'm pained by the naming of ddof. It is simply not intuitive for anyone other than the person who thought it up (and from my recollection, maybe not even for him). For one, most stats folk use 'df' as the abbreviation for 'degrees of freedom'. Secondly, we tend to think of the constant bias adjustment as an ~adjustment~ of the sample size or df, so 'df_adjust=0' or 'sample_adjust=0' would resonate much more. The other issue is to clearly describe whether 'N-1' is obtained by setting the adjustment (whatever it is called) to +1 or -1.

There is a reason why most stats packages have different functions, or take a parameter, to indicate 'sample' versus 'population' variance calculation. Don't take this as a recommendation to use var and varu -- rather, I'm merely pointing out that var(X, vardef='sample') is an option (using SAS's PROC MEANS parameter name as an arbitrary example). In the extremely rare cases where I need any other denominator, I'm fine with multiplying by var(x)*n/(n-adjust).

-Kevin

On Mon, Apr 7, 2008 at 9:41 AM, Pierre GM [EMAIL PROTECTED] wrote:
> Anne, Travis, I have no problem getting rid of varu and stdu in MaskedArray: they were introduced for my own convenience, and they're indeed outdated with the introduction of the ddof parameter. I'll get rid of them next time I post something on the SVN.
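For reference, the behavior being debated: `ddof` is the amount subtracted from N in the denominator, so `ddof=0` gives the population variance and `ddof=1` the sample (unbiased) variance — answering Kevin's +1/-1 question, N-1 comes from `ddof=1`:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
pop = np.var(x)            # denominator N:     5/4 = 1.25
samp = np.var(x, ddof=1)   # denominator N - 1: 5/3 ~ 1.667

# the manual rescaling Kevin mentions is equivalent:
n = x.size
assert np.isclose(pop * n / (n - 1), samp)
```

The same `ddof` convention applies to `np.std`, which is why a single parameter was preferred over separate var/varu functions.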
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On Mon, Apr 07, 2008 at 05:20:47PM +0200, Stéfan van der Walt wrote:
> I agree, and I think we can come to some compromise -- maybe a numpy.all namespace that simply imports all the other subpackages.

For the beginner, "from numpy.all import *" is more confusing than "from numpy import *" (which is already confusing). I know namespaces are good things, but the beginner struggles with them. This is why I used the "import *" in my examples above.

My 2 cents,
Gaël
Re: [Numpy-discussion] Any multigrid package to recommend ?
I am the lead developer of PyTrilinos, a python interface to the Trilinos project: http://trilinos.sandia.gov. Trilinos has many packages related to solvers, including ML, the multilevel preconditioning package. There may be a bit of a learning curve, but there are example scripts to look at. I also built in quite a bit of compatibility with numpy.

On Apr 7, 2008, at 8:14 AM, lan haiping wrote:
> I want to do some applications using the multigrid method for electrostatic problems. Is there a Python package available for my purpose?

** Bill Spotz, Sandia National Laboratories, P.O. Box 5800, Albuquerque, NM 87185-0370. Voice: (505)845-0170, Fax: (505)284-0154, Email: [EMAIL PROTECTED] **
[Numpy-discussion] Interaction between Numpy and the nose framework (was : packaging scipy)
BTW, I stumbled on something strange with the nose framework. If you "from numpy.testing import *" in a test file, the nose framework will try to test the testing module itself by calling every test* method. I just mention it here because I think I'm not the only one to do this for set_package_path, assert_equal, ...

Matthieu

2008/4/7, Gael Varoquaux [EMAIL PROTECTED]:
> For the beginner, "from numpy.all import *" is more confusing than "from numpy import *" (which is already confusing). I know namespaces are good things, but the beginner struggles with them. [...]

-- French PhD student
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher
Re: [Numpy-discussion] Any multigrid package to recommend ?
Thank you, Bill. I will try to climb that curve.

On Mon, Apr 7, 2008 at 11:55 PM, Bill Spotz [EMAIL PROTECTED] wrote:
> I am the lead developer of PyTrilinos, a python interface to the Trilinos project: http://trilinos.sandia.gov. Trilinos has many packages related to solvers, including ML, the multilevel preconditioning package. [...]

-- Hai-Ping Lan, Department of Electronics, Peking University, Beijing, 100871 [EMAIL PROTECTED], [EMAIL PROTECTED]
Re: [Numpy-discussion] Any multigrid package to recommend ?
On 07/04/2008, lan haiping [EMAIL PROTECTED] wrote:
> I want to do some applications using the multigrid method for electrostatic problems. Is there a Python package available for my purpose?

Nathan Bell has a mesmerisingly beautiful webpage at http://graphics.cs.uiuc.edu/~wnbell/ At the bottom, he mentions PyAMG.

Regards
Stéfan
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On 07/04/2008, Gael Varoquaux [EMAIL PROTECTED] wrote:
> For the beginner, "from numpy.all import *" is more confusing than "from numpy import *" (which is already confusing). I know namespaces are good things, but the beginner struggles with them.

You're only a beginner for a short while, and after that the lack of namespaces really starts to bite. I am all in favour of catering for those who are busy learning numpy, but should we do that at the cost of our advanced users? There must be a way to make both groups happy -- any suggestions?

Regards
Stéfan
Re: [Numpy-discussion] Interaction between Numpy and the nose framework (was : packaging scipy)
On 07/04/2008, Matthieu Brucher [EMAIL PROTECTED] wrote:
> BTW, I stumbled on something strange with the nose framework. If you "from numpy.testing import *" in a test file, the nose framework will try to test the testing module itself by calling every test* method.

I've noticed that behaviour, too. Note, however, that you do not need to use set_package_path and friends with nose; you can instead do a fully qualified import:

    from numpy.submod.mod import foo

Regards
Stéfan
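In a test file, the fix amounts to importing only the helpers the module actually uses, so no foreign test* names land in its namespace for nose to collect. A sketch (`assert_equal` and `assert_almost_equal` really are exported by numpy.testing; the test function is illustrative):

```python
# from numpy.testing import *   # <- pulls in test* names nose would collect
from numpy.testing import assert_equal, assert_almost_equal  # explicit, safe

def test_doubling():
    # only this function is collected as a test by nose
    assert_equal(2 * 21, 42)
    assert_almost_equal(0.1 + 0.2, 0.3)
```

With the explicit import, `dir()` of the test module contains exactly the names it defines plus the two helpers, nothing that the test collector could mistake for a test.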
Re: [Numpy-discussion] Interaction between Numpy and the nose framework (was : packaging scipy)
Hi,

On Mon, Apr 7, 2008 at 4:25 PM, Stéfan van der Walt [EMAIL PROTECTED] wrote:
> I've noticed that behaviour, too. Note, however, that you do not need to use set_package_path and friends with nose; you can instead do a fully qualified import: from numpy.submod.mod import foo

Actually, it was intentional to make the scipy.testing namespace a more limited version of the numpy.testing namespace -- in particular, set_package_path was often being used unnecessarily (by me among others) and was clearly leading to confusion. You do, however, get assert_equal and friends with "from scipy.testing import *", so I'd strongly recommend you use that in preference to the numpy.testing import within scipy.

Best,
Matthew
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On Mon, Apr 07, 2008 at 06:22:28PM +0200, Stéfan van der Walt wrote:
> You're only a beginner for a short while, and after that the lack of namespaces really starts to bite. I am all in favour of catering for those who are busy learning numpy, but should we do that at the cost of our advanced users?

I agree with you. However, lowering the bar is a good thing.

> There must be a way to make both groups happy -- any suggestions?

Hum, I am still trying to find ideas. If only "from foo.bar import baz" didn't import everything in foo.__init__!

By the way, the standard solution to this problem is to use a module called "api", not "all". That's what people have been doing to solve the problem we are faced with. If we are going to go this way, I suggest we stick to the api convention (even though it sucks).

Cheers,
Gaël
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On Mon, Apr 7, 2008 at 9:57 AM, Gael Varoquaux [EMAIL PROTECTED] wrote:
> By the way, the standard solution to this problem is to use a module called "api", not "all". That's what people have been doing to solve the problem we are faced with. If we are going to go this way, I suggest we stick to the api convention (even though it sucks).

I prefer 'all' for this, since it has the correct meaning. 'api' -- assuming that one can remember what it means -- doesn't fit: the 'all' module would not contain the api, at least not the preferred api (in my book at least), but it would contain everything.

If "from numpy.all import *" is really too complicated -- which, although possible, seems a little disheartening -- I suspect it would be easy enough to have a separate module that pulled everything in, so that you could use "from big_numpy import *". Or, to preserve backward compatibility, make numpy the unsplit namespace and expose the split namespace under a different name, let's say 'np', because that's what I already use as a numpy abbreviation. Then "import np" would get you just the core np functions (which I imagine we could argue about endlessly), and the various subpackages would be imported separately. 'np' is 'numpy' with some stuff removed: get it? OK, so that's a weak joke, sorry.

-- [EMAIL PROTECTED]
[Numpy-discussion] Scipy and Numpy at Nabble Forums
Hello, I registered the Scipy and Numpy mailing lists at the Nabble Web Forums: Scipy http://www.nabble.com/Scipy-User-f33045.html Numpy http://www.nabble.com/Numpy-discussion-f33046.html I still have to import the old emails from the archives. Kind regards and happy communicating, Tim Michelsen
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On Mon, April 7, 2008 11:16, Timothy Hochberg wrote:
> If "from numpy.all import *" is really too complicated, which although possible, seems a little disheartening, I suspect it would be easy enough to have a separate module that pulled everything in so that you could use "from big_numpy import *". Or, to preserve backward compatibility, make numpy the unsplit namespace and expose the split namespace under a different name, let's say 'np' because that's what I already use as a numpy abbreviation. Then "import np" would get you just the core np functions (which I imagine we could argue about endlessly) and the various subpackages would be imported separately. 'np' is 'numpy' with some stuff removed: get it? OK, so that's a weak joke, sorry.

May not be the epitome of wit, but not bad. +1 for the np package being a minimalist numpy and numpy being bigger.

# Steve
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On Mon, Apr 07, 2008 at 10:16:22AM -0700, Timothy Hochberg wrote:
> I prefer 'all' for this since it has the correct meaning. 'api' (assuming that one can remember what it means) doesn't fit. The 'all' module would not contain the api, at least not the preferred api (in my book at least), but it would contain everything.

Sure, but everybody does it differently. Conventions are important, especially in coding. See http://ivory.idyll.org/blog/sep-07/not-sucking for a good argument on the point. I agree 100% with the author, especially the conclusion.

> If "from numpy.all import *" is really too complicated, which although possible, seems a little disheartening,

How much have you tried forcing Python on people who don't care at all about computers? In my work we spend maybe 2% of our time dealing with computers, and the rest struggling with electronics, optics, lasers, mechanical design... People don't want to have to learn _anything_ about computers. I am not saying they are right; I am saying that we need to provide an easy entry point, from where they can evolve and learn.

> I suspect it would be easy enough to have a separate module that pulled everything in so that you could use "from big_numpy import *". Or, to preserve backward compatibility, make numpy the unsplit namespace and expose the split namespace under a different name, let's say 'np' because that's what I already use as a numpy abbreviation.

That's the only solution I see which would make everybody happy. IMHO the pylab option is quite nice: matplotlib is nice and modular, but pylab has it all. Use whichever you want. Now the difficulty is to find a good name for the latter module/namespace.

Cheers, Gaël
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On Mon, Apr 7, 2008 at 10:30 AM, Gael Varoquaux [EMAIL PROTECTED] wrote:
> On Mon, Apr 07, 2008 at 10:16:22AM -0700, Timothy Hochberg wrote:
> > I prefer 'all' for this since it has the correct meaning. 'api' (assuming that one can remember what it means) doesn't fit. The 'all' module would not contain the api, at least not the preferred api (in my book at least), but it would contain everything.
> Sure, but everybody does it differently. Conventions are important, especially in coding. See http://ivory.idyll.org/blog/sep-07/not-sucking for a good argument on the point. I agree 100% with the author, especially the conclusion.

This is all moot since we agree below, but I don't see that the page you reference, which seems uncontroversial on a casual reading, is all that relevant. It's not that I disagree that following convention is important where reasonable; I just don't see that this is a case where there is a convention to follow. I'm at a bit of a disadvantage, since the convention in question hasn't penetrated the parts of Python land that I inhabit (which could either imply something about my experience or about the universality of the 'api' convention, take your pick). However, I think that I vaguely recall it from back in my C-programming days, and as I recall/infer/guess the 'api' namespace is how you are supposed to use the functions in question, while the actual modules are split out for implementation purposes only. However, in numpy, that is not the case. Any splitting of the namespace would be to support a better, more organized interface, not an implementation detail. So, referring to the collected, flat namespace as 'api' would be confusing at best, since it would imply that it was the official, approved way to access the functions, rather than one of two equivalent apis, where the flat namespace is provided primarily for beginners.
> If "from numpy.all import *" is really too complicated, which although possible, seems a little disheartening,
> How much have you tried forcing Python on people who don't care at all about computers?

Almost none, thankfully.

> In my work we spend maybe 2% of our time dealing with computers, and the rest struggling with electronics, optics, lasers, mechanical design... People don't want to have to learn _anything_ about computers. I am not saying they are right; I am saying that we need to provide an easy entry point, from where they can evolve and learn.
> > I suspect it would be easy enough to have a separate module that pulled everything in so that you could use "from big_numpy import *". Or, to preserve backward compatibility, make numpy the unsplit namespace and expose the split namespace under a different name, let's say 'np' because that's what I already use as a numpy abbreviation.
> That's the only solution I see which would make everybody happy. IMHO the pylab option is quite nice: matplotlib is nice and modular, but pylab has it all. Use whichever you want. Now the difficulty is to find a good name for the latter module/namespace.
> Cheers, Gaël
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
Steven H. Rogers wrote:
> On Mon, April 7, 2008 11:16, Timothy Hochberg wrote:
> > If "from numpy.all import *" is really too complicated, which although possible, seems a little disheartening, I suspect it would be easy enough to have a separate module that pulled everything in so that you could use "from big_numpy import *". Or, to preserve backward compatibility, make numpy the unsplit namespace and expose the split namespace under a different name, let's say 'np' because that's what I already use as a numpy abbreviation. Then "import np" would get you just the core np functions (which I imagine we could argue about endlessly) and the various subpackages would be imported separately. 'np' is 'numpy' with some stuff removed: get it? OK, so that's a weak joke, sorry.
> May not be the epitome of wit, but not bad. +1 for the np package being a minimalist numpy and numpy being bigger.
> # Steve

Hi,

I think that splitting the NumPy namespace should not happen within a major release series, because it would cause too many breakages. Rather, it should happen in a forthcoming release like the 2.0 series, where it may be very feasible to have true core functionality (NumPy), extended functionality (SciPy) and specific applications (Scikits). At the same time, Python 3K would be fully supported, because it will break lots of code.

It is really nice to think about having NumPy Core, NumPy Full, NumPyKits, SciPy Core, SciPy Full and SciPyKits. But splitting namespaces into core and complete brings in the problem of conflicts and how to resolve them. Regardless of the content of each, I have the suspicion that most people would just take the full versions of each, even though most of them only use a very small fraction of NumPy (probably a different fraction for each user).
In the past, the real distinction between NumPy and SciPy for me was the requirement of a full LAPACK installation and a Fortran compiler for SciPy. This made the current scheme very usable, especially given the frustrations of getting SciPy to install. Fortunately, Linux and GCC Fortran have really developed over the years, so these frustrations are not as big as they were, although they still cause issues (especially if packages are broken). However, it remains a big concern if everything has to be built from scratch (perhaps with different compilers) or if developers continue to tell users to get the latest version from svn (a problem if you used a precompiled version).

Regards, Bruce
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On Mon, Apr 7, 2008 at 11:19 AM, Bruce Southey [EMAIL PROTECTED] wrote:
> Hi, I think that splitting the NumPy namespace should not happen within a major release series because it would cause too many breakages. Rather it should be in a forthcoming release like the 2.0 series where it may be very feasible to have a true core functionality (NumPy), extended functionality (SciPy) and specific applications (Scikits). At the same time, Python 3K would be fully supported because it will break lots of code.

I would prefer not to do it at all. We've just gotten people moved over from Numeric; I'd hate to break their trust again.

> It is really nice to think about having NumPy Core, NumPy Full, NumPyKits, SciPy Core, SciPy Full and SciPyKits.

Really? It gives me the shivers, frankly.

-- Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On Mon, Apr 07, 2008 at 11:29:41AM -0700, Robert Kern wrote: I would prefer not to do it at all. We've just gotten people moved over from Numeric; I'd hate to break their trust again. +1. Gaël
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On Mon, Apr 7, 2008 at 11:29 AM, Gael Varoquaux [EMAIL PROTECTED] wrote: On Mon, Apr 07, 2008 at 11:29:41AM -0700, Robert Kern wrote: I would prefer not to do it at all. We've just gotten people moved over from Numeric; I'd hate to break their trust again. +1 I also think we have a big enough proliferation of namespaces (with numpy, scipy, and scikits). -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On Monday 07 April 2008, Robert Kern wrote:
> I would prefer not to do it at all. We've just gotten people moved over from Numeric; I'd hate to break their trust again.

+1. IMO, numpy has arrived at a state where there's just enough namespace clutter to allow most use cases to get by without importing much sub-namespace junk, and I think that's a good place to be (and to stay). For now, I'd be very careful about adding more.

> It is really nice to think about having NumPy Core, NumPy Full, NumPyKits, SciPy Core, SciPy Full and SciPyKits.
> Really? It gives me the shivers, frankly.

Couldn't agree more.

Andreas
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On Mon, Apr 7, 2008 at 11:02 AM, Timothy Hochberg [EMAIL PROTECTED] wrote:
> I'm at a bit of a disadvantage since the convention in question hasn't penetrated the parts of Python land that I inhabit (which could either imply something about my experience or about the universality of the 'api' convention, take your pick). However, I think that I vaguely recall it from back in my C-programming days, and as I recall/infer/guess the 'api' namespace is how you are supposed to use the functions in question, while the actual modules are split out for implementation purposes only.

I haven't been following how many projects have been using the api.py convention, but when I last looked about a year ago there were enthought, peak, zope, trac, etc. See this note for a bit more information: http://neuroimaging.scipy.org/neuroimaging/ni/ticket/86

Hope this helps,

-- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/
[Numpy-discussion] site.cfg doesnt function?
I checked out numpy from svn a few hours ago and created a site.cfg following site.cfg.example. During the build process I get a warning that an unoptimized LAPACK is being used.

Machine: dual core amd64 running gentoo linux. Relevant packages: python 2.5.1, blas-atlas-3.8.0, lapack-atlas-3.8.0

    # site.cfg
    [ALL]
    library_dirs = /usr/lib64/lapack/atlas:/usr/lib64/blas/threaded-atlas:/usr/lib
    include_dirs = /usr/include/atlas:/usr/include

    [blas_opt]
    library_dirs = /usr/lib64/blas/threaded-atlas:/usr/lib64
    libraries = blas, cblas, atlas

    [lapack_opt]
    library_dirs = /usr/lib64/lapack/atlas:/usr/lib64
    libraries = lapack, blas, cblas, atlas

    [fftw]
    libraries = fftw3

I added the following print lines in the system_info class:

    def __init__(self,
                 default_lib_dirs=default_lib_dirs,
                 default_include_dirs=default_include_dirs,
                 verbosity=1,
                 ):
        print '\n\n='
        print 'class: ', self.__class__
        print ' libs: ', default_lib_dirs
        print ' includes: ', default_include_dirs
        print '=\n\n'

A partial dump of "python setup.py build":

Running from numpy source directory.
    F2PY Version 2_4971

    =
    class:  numpy.distutils.system_info.blas_opt_info
     libs:  ['/usr/local/lib', '/usr/lib']
     includes:  ['/usr/local/include', '/usr/include']
    =
    blas_opt_info:

    =
    class:  numpy.distutils.system_info.blas_mkl_info
     libs:  ['/usr/local/lib', '/usr/lib']
     includes:  ['/usr/local/include', '/usr/include']
    =
    blas_mkl_info:
    libraries mkl,vml,guide not found in /usr/lib
    libraries mkl,vml,guide not found in /usr/local/lib
      NOT AVAILABLE

    =
    class:  numpy.distutils.system_info.atlas_blas_threads_info
     libs:  ['/usr/local/lib', '/usr/lib']
     includes:  ['/usr/local/include', '/usr/include']
    =
    atlas_blas_threads_info:
    Setting PTATLAS=ATLAS
      NOT AVAILABLE

    =
    class:  numpy.distutils.system_info.atlas_blas_info
     libs:  ['/usr/local/lib', '/usr/lib']
     includes:  ['/usr/local/include', '/usr/include']
    =
    atlas_blas_info:
      NOT AVAILABLE
    /home/nadav/numpy/numpy/distutils/system_info.py:1345: UserWarning:
        Atlas (http://math-atlas.sourceforge.net/) libraries not found.
        Directories to search for the libraries can be specified in the
        numpy/distutils/site.cfg file (section [atlas]) or by setting
        the ATLAS environment variable.
      warnings.warn(AtlasNotFoundError.__doc__)

    =
    class:  numpy.distutils.system_info.blas_info
     libs:  ['/usr/local/lib', '/usr/lib']
     includes:  ['/usr/local/include', '/usr/include']
    =
    blas_info:
      NOT AVAILABLE
    /home/nadav/numpy/numpy/distutils/system_info.py:1354: UserWarning:
        Blas (http://www.netlib.org/blas/) libraries not found.
        Directories to search for the libraries can be specified in the
        numpy/distutils/site.cfg file (section [blas]) or by setting
        the BLAS environment variable.
      warnings.warn(BlasNotFoundError.__doc__)

    =
    class:  numpy.distutils.system_info.blas_src_info
     libs:  ['/usr/local/lib', '/usr/lib']
     includes:  ['/usr/local/include', '/usr/include']
    =
    blas_src_info:
      NOT AVAILABLE
    /home/nadav/numpy/numpy/distutils/system_info.py:1357: UserWarning:
        Blas (http://www.netlib.org/blas/) sources not found.
        Directories to search for the sources can be specified in the
        numpy/distutils/site.cfg file (section [blas_src]) or by setting
        the BLAS_SRC environment variable.
      warnings.warn(BlasSrcNotFoundError.__doc__)
      NOT AVAILABLE

    =
    class:  numpy.distutils.system_info.lapack_opt_info
     libs:  ['/usr/local/lib', '/usr/lib']
     includes:  ['/usr/local/include', '/usr/include']
    =
    lapack_opt_info:

    =
    class:  numpy.distutils.system_info.lapack_mkl_info
     libs:  ['/usr/local/lib', '/usr/lib']
     includes:  ['/usr/local/include', '/usr/include']
    =
    lapack_mkl_info:

    =
    class:  numpy.distutils.system_info.mkl_info
     libs:  ['/usr/local/lib', '/usr/lib']
     includes:  ['/usr/local/include', '/usr/include']
    =
    mkl_info:
    libraries mkl,vml,guide not found in /usr/lib
    libraries mkl,vml,guide not found in /usr/local/lib
      NOT AVAILABLE
      NOT AVAILABLE

    =
    class:  numpy.distutils.system_info.atlas_threads_info
     libs:
Re: [Numpy-discussion] site.cfg doesnt function?
Hi Nadav,

On Monday 07 April 2008, Nadav Horesh wrote: [snip]

Try something like this:

    [atlas]
    library_dirs = /users/kloeckner/mach/x86_64/pool/lib,/usr/lib
    atlas_libs = lapack, f77blas, cblas, atlas

Andreas
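As an aside, a quick way to check that a site.cfg like the one above at least parses into the sections you intended is the stdlib config parser (a sketch; numpy.distutils uses its own reader, so this only verifies basic INI syntax, not that the libraries will actually be found):

```python
# Sanity-check a site.cfg fragment with the stdlib INI parser.
# Sketch only: confirms the file is well-formed and shows its contents.
import configparser
import io

SITE_CFG = """\
[atlas]
library_dirs = /users/kloeckner/mach/x86_64/pool/lib,/usr/lib
atlas_libs = lapack, f77blas, cblas, atlas
"""

parser = configparser.ConfigParser()
parser.read_file(io.StringIO(SITE_CFG))

for section in parser.sections():
    print(section, dict(parser[section]))
```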
Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram
+1 for axis, and +1 for a keyword to define what to do with values outside the range. For the keyword, rather than 'outliers', I would propose 'discard' or 'exclude', because it could be used to describe the four possibilities:

- discard='low': values lower than the range are discarded; values higher are added to the last bin
- discard='up': values higher than the range are discarded; values lower are added to the first bin
- discard='out': values out of the range are discarded
- discard=None: values outside of the range are allocated to the closest bin

For the default behavior, in most cases the sum of the bins' populations should equal the size of the original array for me, so I would prefer discard=None. But I'm also okay with discard='low' in order not to break older code, if this is clearly stated.

-- LB
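To make the proposal concrete, here is a minimal pure-Python sketch of the four policies (the 'discard' keyword and the function name are hypothetical; this is not numpy's actual histogram):

```python
# Sketch of the proposed 'discard' keyword: equal-width bins over
# [low, high), with the four out-of-range policies described above.
def histogram_discard(data, bins=10, range=None, discard=None):
    low = min(data) if range is None else range[0]
    high = max(data) if range is None else range[1]
    width = (high - low) / bins
    counts = [0] * bins
    for x in data:
        if x < low:
            if discard in ('low', 'out'):
                continue            # drop low outliers
            i = 0                   # else clip into the first bin
        elif x >= high:
            if discard in ('up', 'out'):
                continue            # drop high outliers
            i = bins - 1            # else clip into the last bin
        else:
            i = int((x - low) / width)
        counts[i] += 1
    return counts

print(histogram_discard([0, 1, 2, 3, 9], bins=2, range=(1, 3), discard='out'))  # [1, 1]
```

With discard=None, every input value lands in some bin, so the counts sum to len(data), which is the invariant LB argues for as the default.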
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On 07/04/2008, Andreas Klöckner [EMAIL PROTECTED] wrote:
> On Monday 07 April 2008, Robert Kern wrote:
> > I would prefer not to do it at all. We've just gotten people moved over from Numeric; I'd hate to break their trust again.
> +1. IMO, numpy has arrived at a state where there's just enough namespace clutter to allow most use cases to get by without importing much sub-namespace junk, and I think that's a good place to be (and to stay).

I wouldn't exactly call 494 functions "just enough namespace clutter"; I'd much prefer to have a clean api to work with. I certainly don't propose forcing such an api upon all users, but it should be available as an option, at least. Tim's suggestion for a separate package that pulls in a structured numpy would suit my needs.

As Gael mentioned, __init__'s are cursed; otherwise we'd be able to provide numpy.* for the flat earth society (all in friendly jest ;) and numpy.api to expose a somewhat organised underlying structure. As it is, importing numpy.api would trigger the __init__ of the flat namespace as well; but I'd still be amenable to this solution, since the import doesn't take long and the organisation of the api is more important to me.

Would it therefore make sense to:
a) reorganise numpy to expose functionality as numpy.api.*, and
b) do a series of imports in numpy.__init__ which pull in from numpy.api?

This way, numpy.* would look exactly as it does now, bar the added member 'api'.

Regards, Stéfan
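The proposed layout can be previewed at runtime. The following is entirely hypothetical (numpy has no 'api' subpackage); it fakes one with types.ModuleType, using real subpackages as stand-ins, just to show the two access styles coexisting:

```python
# Fake a structured 'numpy.api' facade (hypothetical; 'numpy.api' does
# not exist). Real subpackages stand in for the reorganised modules.
import sys
import types

import numpy

api = types.ModuleType("numpy.api")
api.fft = numpy.fft          # structured access: api.fft, api.linalg, ...
api.linalg = numpy.linalg
api.random = numpy.random
sys.modules["numpy.api"] = api
numpy.api = api

# Both styles now work side by side:
from numpy.api import linalg   # structured namespace
from numpy import array        # flat namespace, unchanged
```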
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On Mon, Apr 07, 2008 at 10:16:57PM +0200, Stéfan van der Walt wrote:
> Would it therefore make sense to a) reorganise numpy to expose functionality as numpy.api.*, and b) do a series of imports in numpy.__init__ which pull in from numpy.api? This way, numpy.* would look exactly as it does now, bar the added member 'api'.

+1. That way you don't break compatibility, but you provide nested namespaces for people interested in them. You still get the import overhead; that's too bad. With some very good engineering, you might even make it possible to ship only part of numpy for custom installations.

Cheers, Gaël
Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram
On Apr 7, 2008, at 4:14 PM, LB wrote:
> +1 for axis, and +1 for a keyword to define what to do with values outside the range. For the keyword, rather than 'outliers', I would propose 'discard' or 'exclude'. [...] For the default behavior, in most cases the sum of the bins' populations should equal the size of the original array for me, so I would prefer discard=None. But I'm also okay with discard='low' in order not to break older code, if this is clearly stated.

It seems that people in this discussion are forgetting that the bins are actually defined by the lower boundaries supplied, such that bins = [1, 3, 5] currently means:

    bin1 - 1 to 2.9...
    bin2 - 3 to 4.9...
    bin3 - 5 to inf

(Of course, in version 1.0.1 the documentation is inconsistent with the behavior as described by the original poster.) This definition of bins makes it hard to exclude values, as it forces the user to give an extra value in the bin definition; i.e., the bins statement above would only give two bins while supplying three values. That seems confusing to me. I am not sure what the right approach is, but currently using range will clip the values outside the range the user wants.

Cheers, Tommy
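A minimal pure-Python sketch of the lower-edge semantics Tommy describes (illustrative only; the treatment of values below the first edge, dropped here, is an assumption matching the "discard low" behavior discussed above):

```python
import bisect

def histogram_lower_edges(data, bins):
    """Sketch of 'bins as lower boundaries': bins=[1, 3, 5] means
    [1, 3), [3, 5), [5, inf). Values below bins[0] are dropped
    (an assumption; not necessarily what any numpy version did)."""
    counts = [0] * len(bins)
    for x in data:
        i = bisect.bisect_right(bins, x) - 1  # last edge <= x
        if i >= 0:
            counts[i] += 1
    return counts

print(histogram_lower_edges([0.5, 1, 2.9, 3, 100], [1, 3, 5]))  # [2, 1, 1]
```

Note how the last bin is open-ended, so three edges really do define three bins, not two; that asymmetry is exactly what makes excluding high values awkward under this scheme.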
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
> I wouldn't exactly call 494 functions "just enough namespace clutter"; I'd much prefer to have a clean api to work with.

I don't know; the 494 functions do not seem like many to me. Apparently I tend to come down in the flat earth society, although I do like some structure (after all, that's why numpy has numpy.linalg and numpy.fft). I don't think this is the most pressing issue we are facing.

> Would it therefore make sense to a) reorganise numpy to expose functionality as numpy.api.*, and b) do a series of imports in numpy.__init__ which pull in from numpy.api? This way, numpy.* would look exactly as it does now, bar the added member 'api'.

This discussion is interesting, but it is really better suited for 1.1, isn't it? -0 on adding the .api name in 1.0.5

-Travis
Re: [Numpy-discussion] Back to Simple financial functions for NumPy (was Re: packaging scipy)
On Mon, Apr 07, 2008 at 03:52:45PM -0500, Travis E. Oliphant wrote:
> This discussion is interesting, but it is really better suited for 1.1, isn't it?

Yes. The original discussion was about adding simple financial functions for Numpy. I think the functions you wrote should land somewhere. We may disagree on where, and you should choose based on feedback here and your personal feeling, but I hope they will indeed be published somewhere.

Cheers, Gaël
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On Monday 07 April 2008, Stéfan van der Walt wrote:
> I wouldn't exactly call 494 functions "just enough namespace clutter"; I'd much prefer to have a clean api to work with.

Not to bicker, but...

    >>> import numpy
    >>> len(dir(numpy))
    494
    >>> numpy.__version__
    '1.0.4'
    >>> funcs = [s for s in dir(numpy)
    ...          if type(getattr(numpy, s)) in [type(numpy.array), type(numpy.who)]]
    >>> len(funcs)
    251
    >>> classes = [s for s in dir(numpy)
    ...            if type(getattr(numpy, s)) == type(numpy.ndarray)]
    >>> len(classes)
    88
    >>> ufuncs = [s for s in dir(numpy)
    ...           if type(getattr(numpy, s)) == type(numpy.sin)]
    >>> len(ufuncs)
    69

(and, therefore, another 86 names of fluff)

I honestly don't see much of a problem. The only things that maybe should not have been added to numpy.* are the polynomial functions and the convolution windows, conceptually. But in my book that's not big enough to even think of breaking people's code for.

Andreas
Proud Member of the Flat Earth Society
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On 07/04/2008, Travis E. Oliphant [EMAIL PROTECTED] wrote: Would it therefore make sense to a) Reorganise numpy to expose functionality as numpy.api.* b) Do a series of imports in numpy.__init__ which pulls in from numpy.api. This way, numpy.* would look exactly as it does now, bar the added member 'api'. This discussion is interesting, but it is really better suited for 1.1, isn't it? Certainly -- this suggestion is aimed at 1.1 (along with the discussion we had with Anne on API refactoring). -0 on adding the .api name in 1.0.5 Aargh, the +- zeros again! I thought we closed that ticket :) Cheers Stéfan
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On 07/04/2008, Andreas Klöckner [EMAIL PROTECTED] wrote:
> On Monday 07 April 2008, Stéfan van der Walt wrote:
> > I wouldn't exactly call 494 functions "just enough namespace clutter"; I'd much prefer to have a clean api to work with.
> Not to bicker, but...
> >>> import numpy
> >>> len(dir(numpy))
> 494

You'd be glad to know that your investment increased as of r4964:

    494 -> 504
    251 -> 258
    88 -> 89
    69 -> 71

> I honestly don't see much of a problem.

I see at least two: a) these numbers are growing (albeit slowly), and b) numpy.<TAB> under IPython shows 516 completions. It doesn't matter *what* these completions are -- they're still there. Sifting through 500 options isn't fun -- not for a newbie, nor a salty old sailor. I agree with Joe that the problem can be ameliorated by documentation, but I do think that an (optional) fundamental restructuring is ultimately useful.

> Proud Member of the Flat Earth Society

:)

Cheers, Stéfan
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On Mon, Apr 07, 2008 at 11:21:41PM +0200, Stéfan van der Walt wrote:
> It doesn't matter *what* these completions are -- they're still there. Sifting through 500 options isn't fun

I get more than that when I tab in an empty shell on my box. :-} Why do you expect to be able to inspect a module the size of numpy with tab-completion? I agree that for this use case (which is more a question of discovering/exploring the API than using it) a nested namespace is better. And since I think this use case is valid, this is why I like the proposition of having an api submodule with a nested namespace.

My two cents, Gaël
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On Mon, Apr 7, 2008 at 3:21 PM, Stéfan van der Walt [EMAIL PROTECTED] wrote:
> On 07/04/2008, Andreas Klöckner [EMAIL PROTECTED] wrote:
> > On Monday 07 April 2008, Stéfan van der Walt wrote:
> > > I wouldn't exactly call 494 functions "just enough namespace clutter"; I'd much prefer to have a clean api to work with.
> > Not to bicker, but... import numpy; len(dir(numpy)) gives 494.
> You'd be glad to know that your investment increased as of r4964: 494 -> 504, 251 -> 258, 88 -> 89, 69 -> 71.
> > I honestly don't see much of a problem.
> I see at least two: a) these numbers are growing (albeit slowly), and b) numpy.<TAB> under IPython shows 516 completions. It doesn't matter *what* these completions are -- they're still there. Sifting through 500 options isn't fun -- not for a newbie, nor a salty old sailor. I agree with Joe that the problem can be ameliorated by documentation, but I do think that an (optional) fundamental restructuring is ultimately useful.

Yeah, dir needs to print in two columns ;) I think we could use an "apropos" sort of function that indexes the documentation, making it easier to find relevant functions. Apart from that, I think we should stop adding to the numpy namespace. Polynomials, financial functions, image processing: all that is nice to have around, but I don't think it belongs in the top level.

Chuck
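An "apropos" helper along the lines Chuck suggests could be as simple as grepping docstrings. A sketch (the function name and behavior are illustrative, not an existing numpy feature; demonstrated on the stdlib math module, but it works the same on numpy):

```python
def apropos(module, term):
    """Return public names in `module` whose docstring mentions `term`
    (sketch of a documentation-search helper; case-insensitive)."""
    matches = []
    for name in dir(module):
        if name.startswith('_'):
            continue
        doc = getattr(getattr(module, name), '__doc__', None) or ''
        if term.lower() in doc.lower():
            matches.append(name)
    return matches

import math
print(apropos(math, 'cosine'))   # e.g. names like 'cos', 'acos', 'cosh'
```

Applied to numpy, apropos(numpy, 'histogram') would surface the histogram-related functions without having to sift through the whole flat namespace by hand.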
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On Apr 7, 2008, at 2:29 PM, Robert Kern wrote:
> On Mon, Apr 7, 2008 at 11:19 AM, Bruce Southey [EMAIL PROTECTED] wrote:
> > Hi, I think that splitting the NumPy namespace should not happen within a major release series because it would cause too many breakages. Rather it should be in a forthcoming release like the 2.0 series where it may be very feasible to have a true core functionality (NumPy), extended functionality (SciPy) and specific applications (Scikits). At the same time, Python 3K would be fully supported because it will break lots of code.
> I would prefer not to do it at all. We've just gotten people moved over from Numeric; I'd hate to break their trust again.

Amen.

> > It is really nice to think about having NumPy Core, NumPy Full, NumPyKits, SciPy Core, SciPy Full and SciPyKits.
> Really? It gives me the shivers, frankly.

Me too. Some random comments:

1) It seems to me that the primary problem people have with a big flat namespace is that it makes the output of dir() long and unusable, and, by implication, that a nice hierarchical organization would make it easier for people to find stuff. As to the latter: less so than you might think. From what I've seen, there is no organization that everyone will agree to or find intuitive. There will always be functions that logically belong to more than one category. Ultimately, this is why flatter is better as far as that goes. If one wants to find things by category, we would be much better off tagging functions with categories and then building some dir-like tool that helps display things in that category. Some have already alluded to that (Joe Harrington, I believe). The only thing namespaces solve is name collisions, imho. I don't believe that the current numpy has too many names in its basic namespace, and it already has split out some things into subpackages (fft, random, linear algebra) that have such a potential.
2) Some may feel that the combination of from numpy import * with a lot of names makes it hard to see other things in your namespace. True enough. But no amount of winnowing is going to keep the namespace at a point that isn't going to overwhelm everything else. The answer is simple in that case. If that's a problem for you, don't use from numpy import *. Or perhaps another dir-like tool that filters out all numpy/scipy/pylab... items. 3) Some don't like the bloat (in disk space or download sizes) of adding things to numpy. In my case, as long as the addition doesn't make installations any more difficult I don't care. For the great majority, the current size or anything within an order of magnitude is not an important issue. For the 56Kb modem people, perhaps we can construct a numpy-lite, but it shouldn't be the standard distribution. I don't mind the financial functions going into numpy. I think it's a good idea since a lot of people may find that very handy to be part of the core distribution, probably many more than worry about more exotic packages, and likely many more than care about fft, random and linear algebra. 4) The api interface is perhaps a good idea, but as Travis mentions can be deferred. But I don't think I would need it. (see 1) Perry ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
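The "dir-like tool that filters out all numpy/scipy/pylab... items" mentioned above is also only a few lines of Python. A hedged sketch -- the function name `dir_without` is invented -- demonstrated on a plain dict and the stdlib math module so it is self-contained:

```python
import math

def dir_without(namespace, *modules):
    """List names in `namespace`, hiding any name also found in `modules`."""
    hidden = set()
    for mod in modules:
        hidden.update(dir(mod))
    return sorted(n for n in namespace
                  if n not in hidden and not n.startswith('_'))

# After "from math import *" this would hide the math names and show only
# your own variables; here the namespace is an explicit dict for clarity.
ns = {'cos': math.cos, 'pi': math.pi, 'my_result': 42}
print(dir_without(ns, math))
```

In an interactive session one would call `dir_without(globals(), numpy)` instead.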
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
3) Some don't like the bloat (in disk space or download sizes) of adding things to numpy. In my case, as long as the addition doesn't make installations any more difficult I don't care. For the great majority, the current size or anything within an order of magnitude is not an important issue. For the 56Kb modem people, perhaps we can construct a numpy-lite, but it shouldn't be the standard distribution. I don't mind the financial functions going into numpy. I think it's a good idea since a lot of people may find that very handy to be part of the core distribution, probably many more than worry about more exotic packages, and likely many more than care about fft, random and linear algebra. The only problem is that if we keep adding things to numpy that could be in scipy, it will _never_ be clear to users where they can expect to find things. It is already bad enough. How do I explain to a user/student/scientist that ffts and linear algebra are in numpy, but that integration and interpolation are in scipy? That doesn't make any sense to them. Oh but wait, linear algebra and ffts are also in scipy! Random numbers - take a guess - wrong, they are in numpy. As far as I am concerned, financial functions are completely outside the conceptual scope that numpy has established = arrays, fft, linalg, random. In fact, they are far outside it. Simply putting things into numpy because of convenience (numpy is easier to install) only encourages people to never install or use scipy. If scipy is that much of a pain to install and use - we should spend our time improving scipy. Cheers, Brian ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On Apr 7, 2008, at 5:54 PM, Brian Granger wrote: The only problem is that if we keep adding things to numpy that could be in scipy, it will _never_ be clear to users where they can expect to find things. It is already bad enough. How do I explain to a user/student/scientist that ffts and linear algebra are in numpy, but that integration and interpolation are in scipy? That doesn't make any sense to them. Oh but wait, linear algebra and ffts are also in scipy! Random numbers - take a guess - wrong, they are in numpy. As far as I am concerned, financial functions are completely outside the conceptual scope that numpy has established = arrays, fft, linalg, random. In fact, they are far outside it. Simply putting things into numpy because of convenience (numpy is easier to install) only encourages people to never install or use scipy. If scipy is that much of a pain to install and use - we should spend our time improving scipy. Cheers, Brian To me, the biggest characteristic difference between the two is the ease of installation. If installation weren't an issue, I would tell everyone to use scipy and then the confusion would be ended. But the installation issue is not a trivial one to solve (if it were, we'd already be there). But a nice ideal is that the numpy namespace should map directly into scipy's so that if I expected numpy.xxx to work, then scipy.xxx should also work. That would lessen the confusion of finding things. If it isn't in numpy, it's in scipy. But otherwise one is a subset of the other. (I say this from complete ignorance, I'm not sure what prevents this, and there may be very good reasons why this can't be done). Perry ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] packaging scipy (was Re: Simple financial functions for NumPy)
On Mon, Apr 7, 2008 at 4:03 PM, Perry Greenfield [EMAIL PROTECTED] wrote: On Apr 7, 2008, at 5:54 PM, Brian Granger wrote: The only problem is that if we keep adding things to numpy that could be in scipy, it will _never_ be clear to users where they can expect to find things. It is already bad enough. How do I explain to a user/student/scientist that ffts and linear algebra are in numpy, but that integration and interpolation are in scipy? That doesn't make any sense to them. Oh but wait, linear algebra and ffts are also in scipy! Random numbers - take a guess - wrong, they are in numpy. As far as I am concerned, financial functions are completely outside the conceptual scope that numpy has established = arrays, fft, linalg, random. In fact, they are far outside it. Simply putting things into numpy because of convenience (numpy is easier to install) only encourages people to never install or use scipy. If scipy is that much of a pain to install and use - we should spend our time improving scipy. Cheers, Brian To me, the biggest characteristic difference between the two is the ease of installation. If installation weren't an issue, I would tell everyone to use scipy and then the confusion would be ended. But the installation issue is not a trivial one to solve (if it were, we'd already be there). I definitely understand the installation issue. Is the main thing people run into the fortran compiler, BLAS and LAPACK? To me it seems like the scipy install is pretty simple these days. Do we need better installation documentation? Deep down I so wish we could ditch fortran! In almost all cases I know of, projects that have fortran involved are the worse for it. But a nice ideal is that the numpy namespace should map directly into scipy's so that if I expected numpy.xxx to work, then scipy.xxx should also work. That would lessen the confusion of finding things. If it isn't in numpy, it's in scipy. But otherwise one is a subset of the other.
(I say this from complete ignorance, I'm not sure what prevents this, and there may be very good reasons why this can't be done). I think this is a very good idea to follow, at least for things that do happen to be in both places. But I still don't think we should have many things that are in both places. Brian Perry ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Tests 32/64 bits
Hi all, here's the output difference between 32/64 bit tests run:

bic128[scipy]$ diff -u tests_32.txt tests_64.txt
--- tests_32.txt    2008-04-07 15:54:29.0 -0700
+++ tests_64.txt    2008-04-07 15:53:58.0 -0700
@@ -611,12 +611,6 @@
 testip_types (numpy.core.tests.test_multiarray.TestPutmask) ... ok
 testip_types (numpy.core.tests.test_multiarray.TestPutmask) ... ok
 testip_types (numpy.core.tests.test_multiarray.TestPutmask) ... ok
-testip_types (numpy.core.tests.test_multiarray.TestPutmask) ... ok
-testip_types (numpy.core.tests.test_multiarray.TestPutmask) ... ok
-testip_types (numpy.core.tests.test_multiarray.TestPutmask) ... ok
-testip_types (numpy.core.tests.test_multiarray.TestPutmask) ... ok
-testip_types (numpy.core.tests.test_multiarray.TestPutmask) ... ok
-testip_types (numpy.core.tests.test_multiarray.TestPutmask) ... ok
 testip_zero (numpy.tests.test_linalg.TestMatrixPower) ... ok
 testip_zero (numpy.tests.test_linalg.TestMatrixPower) ... ok
 testip_zero (numpy.tests.test_linalg.TestMatrixPower) ... ok

On 32 bits, 869 are found and on Fedora x86_64, it's 863. Above is the difference (requested by rkern). Gotta run... f ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Tests 32/64 bits
On Mon, Apr 7, 2008 at 3:56 PM, Fernando Perez [EMAIL PROTECTED] wrote: Hi all, here's the output difference between 32/64 bit tests run:

bic128[scipy]$ diff -u tests_32.txt tests_64.txt
--- tests_32.txt    2008-04-07 15:54:29.0 -0700
+++ tests_64.txt    2008-04-07 15:53:58.0 -0700
@@ -611,12 +611,6 @@
 testip_types (numpy.core.tests.test_multiarray.TestPutmask) ... ok
 testip_types (numpy.core.tests.test_multiarray.TestPutmask) ... ok
 testip_types (numpy.core.tests.test_multiarray.TestPutmask) ... ok
-testip_types (numpy.core.tests.test_multiarray.TestPutmask) ... ok
-testip_types (numpy.core.tests.test_multiarray.TestPutmask) ... ok
-testip_types (numpy.core.tests.test_multiarray.TestPutmask) ... ok
-testip_types (numpy.core.tests.test_multiarray.TestPutmask) ... ok
-testip_types (numpy.core.tests.test_multiarray.TestPutmask) ... ok
-testip_types (numpy.core.tests.test_multiarray.TestPutmask) ... ok
 testip_zero (numpy.tests.test_linalg.TestMatrixPower) ... ok
 testip_zero (numpy.tests.test_linalg.TestMatrixPower) ... ok
 testip_zero (numpy.tests.test_linalg.TestMatrixPower) ... ok

On 32 bits, 869 are found and on Fedora x86_64, it's 863. Above is the difference (requested by rkern). I think this is fine. The difference arises because of extra scalar types on the 32 bit system that don't show up on the AMD system, presumably float96 and complex192. Check numpy.sctypes for the difference. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Spyke : moving forward
Thanks everyone for suggestions and enthusiasm.

1. For type declarations, moving from string-based annotations to decorators for functions. (Function annotations in Python 3000: it should be easy to add them once that feature comes out.)

2. License: Keeping it as GPLv3 for the compiler. The runtime is just a Python script that invokes the real compiler binary and can be licensed under LGPL or BSD if that's what people prefer.

3. Release: Will release source+binary in 2-3 weeks. Need to get some stuff sorted at uni. Please be patient :)

4. Will establish a test suite at Google Code in a couple of days. Everyone is encouraged to contribute test cases, whether big or small. Having a proper test suite will mean we can better track the bugs and features in the compiler on a daily basis. Are there any suggestions on how the test suite should be organized? I want a test suite where we have, let's say, N tests and a script runs all N tests, compares the expected output, and says pass/fail for each one of them. Also, what license is suitable for the test suite? The code will remain 100% pure Python so I believe any license can be chosen.

5. Interop with C: I am thinking of a module which wraps the functionality of ctypes with some added type declarations. But this won't work anytime soon.

6. Invoking the compiler at runtime instead of as a standalone compiler: I had not thought of invoking Spyke at runtime through the decorator. Currently Spyke is invoked standalone from the command line. But now that we are adding decorators, as suggested by Travis and others, I will look into how to invoke the compiler directly from the decorator.

7. Support for classes: Basic support for classes, but the code generated currently for classes is pretty slow. No support for exceptions currently.

8. Long term plans: I intend to use Spyke as a platform for some research into compiler optimizations. From time to time I may play with some things but those will be kept out of the trunk.
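For item 4, one minimal way to organize a pass/fail suite of N tests is to pair each test script with a file holding its expected output. The runner below is only a sketch of that idea -- the tests/*.py plus *.expected layout is an assumption, not something Spyke defines:

```python
# Sketch: each tests/foo.py is paired with tests/foo.expected holding the
# stdout the script should produce. The runner prints pass/FAIL per test.
import glob
import os
import subprocess
import sys

def run_suite(test_dir):
    """Run every *.py under test_dir, compare stdout; return failure count."""
    failures = 0
    for script in sorted(glob.glob(os.path.join(test_dir, '*.py'))):
        expected_path = os.path.splitext(script)[0] + '.expected'
        with open(expected_path) as f:
            expected = f.read()
        result = subprocess.run([sys.executable, script],
                                capture_output=True, text=True)
        ok = result.returncode == 0 and result.stdout == expected
        print('%-40s %s' % (os.path.basename(script), 'pass' if ok else 'FAIL'))
        if not ok:
            failures += 1
    return failures
```

A nightly cron job could then call run_suite and report a nonzero exit status whenever any test fails.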
thanks, rahul ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Spyke : moving forward
On Mon, Apr 7, 2008 at 5:28 PM, Rahul Garg [EMAIL PROTECTED] wrote: 2. License : Keeping it as GPLv3 for compiler. The runtime is just a python script that invokes the real compiler binary and can be licensed under LGPL or BSD if thats what people prefer. Can you clarify this? That is not what I would have called a runtime. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Tests 32/64 bits
On Mon, Apr 7, 2008 at 4:04 PM, Robert Kern [EMAIL PROTECTED] wrote: On 32 bits, 869 are found and on Fedora x86_64, it's 863. Above is the difference (requested by rkern). I think this is fine. The difference arises because of extra scalar types on the 32 bit system that don't show up on the AMD system, presumably float96 and complex192. Check numpy.sctypes for the difference. - complex192 isn't on the 64bit box, but complex256 is, so that keeps the number of tests for that type equal. - float96 -> float128, again no change in test count - for some reason, the 32-bit box gives 'int': [<type 'numpy.int8'>, <type 'numpy.int16'>, <type 'numpy.int32'>, <type 'numpy.int32'>, <type 'numpy.int64'>], so there's a repeated int32 type listed there. I don't know what that means, but obviously it produces extra tests (possibly redundant?) - Same for uint: 'uint': [<type 'numpy.uint8'>, <type 'numpy.uint16'>, <type 'numpy.uint32'>, <type 'numpy.uint32'>, <type 'numpy.uint64'>]} In any case, other than this minor difference, current numpy SVN passes all tests on Fedora8/64bit and Ubuntu Gutsy/32bit. Cheers, f ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Tests 32/64 bits
On Mon, Apr 7, 2008 at 6:29 PM, Fernando Perez [EMAIL PROTECTED] wrote: On Mon, Apr 7, 2008 at 4:04 PM, Robert Kern [EMAIL PROTECTED] wrote: On 32 bits, 869 are found and on Fedora x86_64, it's 863. Above is the difference (requested by rkern). I think this is fine. The difference arises because of extra scalar types on the 32 bit system that don't show up on the AMD system, presumably float96 and complex192. Check numpy.sctypes for the difference. - complex192 isn't on the 64bit box, but complex256 is, so that keeps the number of tests for that type equal. - float96 -> float128, again no change in test count - for some reason, the 32-bit box gives 'int': [<type 'numpy.int8'>, <type 'numpy.int16'>, <type 'numpy.int32'>, <type 'numpy.int32'>, <type 'numpy.int64'>], so there's a repeated int32 type listed there. I don't know what that means, but obviously it produces extra tests (possibly redundant?) Definitely redundant, but harmless. The code that generates these lists is numpy/core/numerictypes.py:_set_array_types(). The redundant ones are dtype('p').type and dtype('P').type. For some reason these do not compare equal to numpy.{u}int32. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
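To see what Robert describes, one can inspect the 'p' type code directly. A small check -- whether dtype('p').type compares equal to the fixed-width scalar type varies across numpy versions and platforms, so only the pointer-sized itemsize is asserted here:

```python
import numpy as np

# 'p' is the pointer-sized signed integer code (C intptr_t); 'P' is its
# unsigned counterpart. On the 32-bit box above, their scalar types compared
# unequal to numpy.int32/uint32, hence the duplicated list entries.
pt = np.dtype('p')
print(pt, pt.itemsize, pt.type)
# It always describes an integer the size of a pointer on this platform:
print(pt.itemsize == np.dtype(np.intp).itemsize)
```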
Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram
On Apr 7, 2008, at 4:14 PM, LB wrote: +1 for axis and +1 for a keyword to define what to do with values outside the range. For the keyword, rather than 'outliers', I would propose 'discard' or 'exclude', because it could be used to describe the four possibilities:
- discard='low' = values lower than the range are discarded, values higher are added to the last bin
- discard='up' = values higher than the range are discarded, values lower are added to the first bin
- discard='out' = values out of the range are discarded
- discard=None = values outside of this range are allocated to the closest bin

Suppose you set bins=5, range=[0, 10], discard=None; should the returned bins be [0, 2, 4, 6, 8, 10] or [-inf, 2, 4, 6, 8, inf]? Now suppose normed=True: what should be the density for the first and last bin? It seems to me it should be zero, since we are assuming that the bins extend to -infinity and infinity, but then taking the outliers into account seems pretty useless. Overall, I think discard is a confusing option with little added value. Getting the outliers is simply a matter of defining the bin edges explicitly, i.e. [-inf, x0, x1, ..., xn, inf]. In any case, attached is a version of histogram implementing the axis and discard keywords. I'd really prefer though if we dumped the discard option. David ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion

def histogram(a, bins=10, range=None, normed=False, discard='out', axis=None):
    """Compute the histogram from a set of data.

    Parameters:

      a : array
          The data to histogram.

      bins : int or sequence of floats
          If an int, then the number of equal-width bins in the given range.
          Otherwise, a sequence of the lower bound of each bin.

      range : (float, float)
          The lower and upper range of the bins. If not provided, then
          (a.min(), a.max()) is used. Values outside of this range are
          allocated according to the discard keyword.

      normed : bool
          If False, the result array will contain the number of samples in
          each bin. If True, the result array is the value of the probability
          *density* function at the bin normalized such that the *integral*
          over the range is 1. Note that the sum of all of the histogram
          values will not usually be 1; it is not a probability *mass*
          function.

      discard : 'out', 'low', 'high', None
          With 'out', values outside range are not tallied. Using 'low'
          ('high'), values lower (greater) than range are discarded, and
          values higher (lower) than range are tallied in the closest bin.
          Using None, values outside of range are stored in the closest bin.

      axis : None or int
          Axis along which histogram is performed. If None, applies on the
          entire array.

    Returns:

      hist : array
          The values of the histogram. See `normed` for a description of the
          possible semantics.

      edges : float array
          The bin edges.

    SeeAlso: histogramdd
    """
    a = asarray(a).ravel()

    if (range is not None):
        mn, mx = range
        if (mn > mx):
            raise AttributeError, 'max must be larger than min in range parameter.'

    if not iterable(bins):
        if range is None:
            range = (a.min(), a.max())
        mn, mx = [mi+0.0 for mi in range]
        if mn == mx:
            mn -= 0.5
            mx += 0.5
        bins = linspace(mn, mx, bins+1, endpoint=True)
    else:
        bins = asarray(bins)
        if (np.diff(bins) < 0).any():
            raise AttributeError, 'bins must increase monotonically.'

    if discard is None:
        bins = np.r_[-np.inf, bins[1:-1], np.inf]
    elif discard == 'low':
        bins = np.r_[bins[:-1], np.inf]
    elif discard == 'high':
        bins = np.r_[-np.inf, bins[1:]]
    elif discard == 'out':
        pass
    else:
        raise ValueError, 'discard keyword not in (None, out, high, low): %s' % discard

    if axis is None:
        return histogram1d(a.ravel(), bins, normed), bins
    else:
        return np.apply_along_axis(histogram1d, axis, a, bins, normed), bins


def histogram1d(a, bins, normed):
    """Internal usage function to compute an histogram on a 1D array.

    Parameters:

      a : array
          The data to histogram.

      bins : sequence
          The edges of the bins.

      normed : bool
          If false, return the number of samples falling into each bin.
          If true, return the density of the sample in each bin.
    """
    # best block size probably depends on processor cache size
    block = 65536
    n = np.zeros(bins.shape, int)
    for i in xrange(0, a.size, block):
        sa =
Re: [Numpy-discussion] site.cfg doesn't function?
I tried:

[ALL]
library_dirs = /usr/lib64/lapack/atlas:/usr/lib64/blas/threaded-atlas:/usr/lib
include_dirs = /usr/include/atlas:/usr/include
[blas_opt]
library_dirs = /usr/lib64/blas/threaded-atlas:/usr/lib64
libraries = blas, cblas, atlas
[lapack_opt]
library_dirs = /usr/lib64/lapack/atlas:/usr/lib64
libraries = lapack, blas, cblas, atlas
[fftw]
libraries = fftw3
[atlas]
library_dirs = /usr/lib64/lapack/atlas:/usr/lib64/blas/threaded-atlas:/usr/lib
include_dirs = /usr/include/atlas:/usr/include
libraries = lapack, blas, cblas, atlas

but it did not change anything. Any ideas? Nadav.

-----Original Message-----
From: [EMAIL PROTECTED] on behalf of Andreas Klöckner
Sent: Mon 07-April-08 21:56
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] site.cfg doesn't function?

Hi Nadav, On Montag 07 April 2008, Nadav Horesh wrote: [snip] Try something like this:

[atlas]
library_dirs = /users/kloeckner/mach/x86_64/pool/lib,/usr/lib
atlas_libs = lapack, f77blas, cblas, atlas

Andreas ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] site.cfg doesn't function?
Nadav,

[ALL]
library_dirs = /usr/lib64/lapack/atlas:/usr/lib64/blas/threaded-atlas:/usr/lib
include_dirs = /usr/include/atlas:/usr/include

I believe (contrary to my 'unix' intuition) that you should replace the colons by commas, i.e.:

include_dirs = /usr/include/atlas,/usr/include
library_dirs = /foo/path,/foo/path2

Cheers, Sebastien.
--
###
# Dr. Sebastien Binet #
# Lawrence Berkeley National Lab. #
# 1 Cyclotron Road #
# Berkeley, CA 94720 #
###
___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
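Putting Sebastien's advice together with Nadav's original sections, a comma-separated site.cfg might look like the fragment below. The paths are Nadav's and purely illustrative; whether your distribution ships these exact directories is an assumption, so adjust them to your system:

```ini
[ALL]
library_dirs = /usr/lib64/lapack/atlas,/usr/lib64/blas/threaded-atlas,/usr/lib
include_dirs = /usr/include/atlas,/usr/include

[atlas]
library_dirs = /usr/lib64/lapack/atlas,/usr/lib64/blas/threaded-atlas
include_dirs = /usr/include/atlas,/usr/include
atlas_libs = lapack, f77blas, cblas, atlas
```

Note the comma separators in the path lists, where a shell PATH would use colons.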