Re: [Numpy-discussion] Generalized ufuncs?
Numpy 1.2 is for documentation, bug fixes, and getting the new testing framework in place. Discipline is called for if we are going to have timely releases. We also agreed to a change in the C-API (or at least did not object too loudly). I'm in favor of minimizing that sort of change. Why not wait until after the release then? The biggest reason is that the patch requires changing the C-API and we are already doing that for 1.2. I would rather not do it again for another 6 months at least. I don't think we should make the patch wait that long. Your code review is very much appreciated. -Travis
Re: [Numpy-discussion] Generalized ufuncs?
On Thu, Aug 14, 2008 at 10:54 PM, Charles R Harris [EMAIL PROTECTED] wrote: Numpy 1.2 is for documentation, bug fixes, and getting the new testing framework in place. Discipline is called for if we are going to have timely releases. First, all your points are very valid. And I apologize for the role I played in this. Thanks for calling us on it. That said, while you are correct that this release is mainly about documentation, bug fixes, and getting the new testing framework in place, there are several other things that have gone in. There have been a few planned API changes and even a C-API change. Travis emailed me asking where we were on the beta release and whether we should discuss including this change on the list. I contacted Stefan and asked him if he could do me a huge favor and see if we could quickly apply the patch before making the beta release. My reasoning was that this looked very good and useful and just offered something new. Stefan was hesitant, but I persisted. He didn't like that it didn't have any tests, but I said that if he got it in in time for the beta he could add tests afterward. I wanted to make sure no new features got in after a beta. Also, we are already requiring recompiling with this release, so I thought now would be a good time to add it. We is the numpy community, not you and Travis. Absolutely. There were several of us involved, not just Travis and Stefan. But that is no excuse. Stefan, David, Chris, and I have been trying very hard to get the beta out over the last few days and had started talking among ourselves, since we were mostly just coordinating. Taking that over to feature adding was a mistake. Why not wait until after the release then? The motivation is that we are not allowing features in bugfix releases anymore, so it can't go into 1.2.x if it isn't in 1.2.0. I also want to get several 1.2.x releases out. That means the earliest we could get it in is 1.3.0. But I would prefer not having to require recompiling extension code with every minor release. Sorry. This was handled poorly. But I think this would still be very useful and I would like to see it get in. We were planning on releasing a 1.2.0b3 early next week. But this is it, I promise. How about we work on it and see where we are early next week? If it doesn't look good, we can pull it. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/
Re: [Numpy-discussion] Generalized ufuncs?
Can we fix the ticket notification mailings some day? It's been almost four months now. That would be fabulous. So far nobody has figured out how... Jarrod?? Re: the patch. I noticed the replacement of the signed type int by an unsigned size_t. Where did you notice this? I didn't see it. python or numpy types. The use of inline and the local declaration of variables would also have been caught early in a code review. What do you mean by the local declaration of variables? -Travis
Re: [Numpy-discussion] NumPy 1.2.0b2 released
On Aug 14, 2008, at 11:07 PM, Alan G Isaac wrote: Btw, numpy loads noticeably faster. Any chance of someone reviewing my suggestions for making the import somewhat faster still? http://scipy.org/scipy/numpy/ticket/874 Andrew [EMAIL PROTECTED]
Re: [Numpy-discussion] Generalized ufuncs?
Travis E. Oliphant wrote: Can we fix the ticket notification mailings some day? It's been almost four months now. That would be fabulous. So far nobody has figured out how... Jarrod?? Re: the patch. I noticed the replacement of the signed type int by an unsigned size_t. Where did you notice this? I didn't see it. Are you referring to Stefan's patch to Fu's _parse_signature code in r5654? This is a local function, so I'm not sure why there is a concern. python or numpy types. The use of inline and the local declaration of variables would also have been caught early in a code review. What do you mean by the local declaration of variables? Never mind, I understand it's the mid-code declaration of variables (without a separate block defined) that Stefan fixed. -Travis
Re: [Numpy-discussion] Generalized ufuncs?
On Fri, Aug 15, 2008 at 12:35 AM, Travis E. Oliphant [EMAIL PROTECTED] wrote: Travis E. Oliphant wrote: Can we fix the ticket notification mailings some day? It's been almost four months now. That would be fabulous. So far nobody has figured out how... Jarrod?? Re: the patch. I noticed the replacement of the signed type int by an unsigned size_t. Where did you notice this? I didn't see it. Are you referring to Stefan's patch to Fu's _parse_signature code in r5654? This is a local function, so I'm not sure why there is a concern. There probably isn't a problem, but the use of unsigned types in loop counters and such can lead to subtle errors, so when a signed type is changed to an unsigned type the code has to be audited to make sure there won't be any unintended consequences. Chuck
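[For illustration, a minimal C sketch (hypothetical code, not from the patch) of the audit concern Chuck describes: a reverse loop that is correct with a signed counter silently becomes an infinite loop after a blind int-to-size_t change, because an unsigned counter can never go negative.]

#include <stddef.h>
#include <stdio.h>

/* Correct with a signed counter: i reaches -1 and the loop exits. */
static void reverse_scan(const double *data, int n)
{
    for (int i = n - 1; i >= 0; i--) {
        printf("%f\n", data[i]);
    }
    /* The same loop with an unsigned counter never terminates:
     *   for (size_t i = n - 1; i >= 0; i--) { ... }
     * "i >= 0" is always true for an unsigned type, and i wraps
     * around to SIZE_MAX instead of going below zero. */
}

int main(void)
{
    double d[3] = {1.0, 2.0, 3.0};
    reverse_scan(d, 3);
    return 0;
}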
Re: [Numpy-discussion] Generalized ufuncs?
On Aug 15, 2008, at 8:36 AM, Charles R Harris wrote: The inline keyword also tends to be gcc/icc specific, although it is part of the C99 standard. For reference, a page on using inline and doing so portably: http://www.greenend.org.uk/rjk/2003/03/inline.html Andrew [EMAIL PROTECTED]
Re: [Numpy-discussion] Generalized ufuncs?
On Fri, Aug 15, 2008 at 12:45 AM, Andrew Dalke [EMAIL PROTECTED] wrote: On Aug 15, 2008, at 8:36 AM, Charles R Harris wrote: The inline keyword also tends to be gcc/icc specific, although it is part of the C99 standard. For reference, a page on using inline and doing so portably: http://www.greenend.org.uk/rjk/2003/03/inline.html Doesn't do the trick for compilers that aren't C99 compliant. And there are many of them. For gcc there are other options.

-finline-functions
    Integrate all simple functions into their callers. The compiler heuristically decides which functions are simple enough to be worth integrating in this way. If all calls to a given function are integrated, and the function is declared static, then the function is normally not output as assembler code in its own right. Enabled at level -O3.

-finline-functions-called-once
    Consider all static functions called once for inlining into their caller even if they are not marked inline. If a call to a given function is integrated, then the function is not output as assembler code in its own right. Enabled if -funit-at-a-time is enabled.

Chuck
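[A short sketch of the portability-shim approach the linked page describes; the macro name here is illustrative, not an actual NumPy define. The idea is to pick a compiler-specific spelling of inline at preprocessing time and fall back to a plain static function elsewhere.]

#if defined(__GNUC__)
#  define INLINE_SKETCH __inline__   /* gcc and icc accept this spelling pre-C99 */
#elif defined(_MSC_VER)
#  define INLINE_SKETCH __inline     /* Visual Studio's spelling */
#else
#  define INLINE_SKETCH              /* unknown compiler: just a static function */
#endif

static INLINE_SKETCH int add_one(int x)
{
    return x + 1;   /* small enough that inlining is plausible everywhere */
}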
Re: [Numpy-discussion] Generalized ufuncs?
On Thu, Aug 14, 2008 at 9:05 PM, Charles R Harris [EMAIL PROTECTED] wrote: Can we fix the ticket notification mailings some day? It's been almost four months now. It should work now. Let me know if you aren't getting them now. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/
Re: [Numpy-discussion] Generalized ufuncs?
On Fri, Aug 15, 2008 at 1:04 AM, Jarrod Millman [EMAIL PROTECTED] wrote: On Thu, Aug 14, 2008 at 9:05 PM, Charles R Harris [EMAIL PROTECTED] wrote: Can we fix the ticket notification mailings some day? It's been almost four months now. It should work now. Let me know if you aren't getting them now. Thanks, it seems to be working now. What did you do? Chuck
Re: [Numpy-discussion] NumPy 1.2.0b2 released
On Fri, Aug 15, 2008 at 02:59, Stéfan van der Walt [EMAIL PROTECTED] wrote: 2008/8/15 Robert Kern [EMAIL PROTECTED]: The devil is in the details. What exactly do you propose? When we discussed this last time, the participants more or less agreed that environment variables could cause more fragility than they're worth. It also breaks the first time you try to import a numpy-using library that was not written with this in mind. Basically, you're stuck with only code that you've written. First, I propose that I write some code. Second, I do not suggest the behaviour above, but: 1) Expose a new interface to numpy, called numpy.api 2) If a certain environment variable is set, the numpy namespace is not populated, and numpy.api becomes instantaneous to load. Even if the user forgets to set the variable, everything works as planned. If the user is aware of the variable, he won't be using numpy the normal way, so the fact that numpy.* is not available won't matter. I'm afraid that I still don't understand. Please expand on the following four cases (let's call the environment variable NUMPY_FAST_IMPORT):

1) NUMPY_FAST_IMPORT=0 (or simply absent)
   import numpy
   print dir(numpy)

2) NUMPY_FAST_IMPORT=0
   import numpy.api
   print dir(numpy.api)

3) NUMPY_FAST_IMPORT=1
   import numpy
   print dir(numpy)

4) NUMPY_FAST_IMPORT=1
   import numpy.api
   print dir(numpy.api)

-- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco
Re: [Numpy-discussion] NumPy 1.2.0b2 released
2008/8/15 Robert Kern [EMAIL PROTECTED]: I'm afraid that I still don't understand. Please expand on the... Sorry, it's late. My explanation is probably not too lucid. The variable should rather read something like NUMPY_VIA_API, but here goes.

1) NUMPY_FAST_IMPORT=0 (or simply absent); import numpy: full numpy import, exactly as it is now.

2) NUMPY_FAST_IMPORT=0; import numpy.api: numpy.* exactly as it is now, and numpy.api provides a more nested API to NumPy. Import time is the same as the current NumPy import.

3) NUMPY_FAST_IMPORT=1; import numpy, and 4) NUMPY_FAST_IMPORT=1; import numpy.api: numpy.* is now probably close to empty. numpy.api is accessible as before. Import time for numpy.api is now super snappy since numpy.* is not populated.

If this is not clear, then I need to sleep and implement a proof of concept before I try to explain further. Cheers Stéfan
Re: [Numpy-discussion] NumPy 1.2.0b2 released
Jarrod Millman wrote: Hey, NumPy 1.2.0b2 is now available. Please test this so that we can uncover any problems ASAP. Windows binary: http://www.enthought.com/~gvaroquaux/numpy-1.2.0b2-win32.zip Hello Again, It seems the new release breaks matplotlib, for those poor souls who are using pre-compiled binaries at least. If this means all C-modules compiled against numpy have to be recompiled, then this will make me very unhappy. -Jon

H:\> c:\python25\python -c "import matplotlib; print matplotlib.__version__; import matplotlib.pylab"
0.98.3
RuntimeError: module compiled against version 109 of C-API but this version of numpy is 10a
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Python25\Lib\site-packages\matplotlib\pylab.py", line 206, in <module>
    from matplotlib import mpl # pulls in most modules
  File "C:\Python25\Lib\site-packages\matplotlib\mpl.py", line 1, in <module>
    from matplotlib import artist
  File "C:\Python25\Lib\site-packages\matplotlib\artist.py", line 4, in <module>
    from transforms import Bbox, IdentityTransform, TransformedBbox, TransformedPath
  File "C:\Python25\Lib\site-packages\matplotlib\transforms.py", line 34, in <module>
    from matplotlib._path import affine_transform
ImportError: numpy.core.multiarray failed to import
Re: [Numpy-discussion] min() of array containing NaN
If you're willing to do arithmetic you might even be able to pull it off, since NaNs tend to propagate: if (new < min) min -= (min-new); Whether the speed of this is worth its impenetrability I couldn't say. Code comments cure impenetrability, and have no cost in speed. One could write a paragraph explaining it (if it really needed that much). The comments could even reference the current discussion. --jh--
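[A hedged C sketch of the propagation property the quoted trick leans on (illustrative code, not the proposed implementation): comparisons involving NaN are simply false, which is how a plain guarded minimum drops NaNs, whereas any arithmetic touching a NaN yields NaN.]

#include <math.h>
#include <stdio.h>

int main(void)
{
    double min = 1.0;
    double new_val = nan("");   /* a quiet NaN from math.h (link with -lm) */

    /* Comparison: always false when either operand is NaN, so a
     * guarded update "if (new_val < min) min = new_val;" silently
     * keeps the old minimum and loses the NaN. */
    printf("%d\n", new_val < min);           /* prints 0 */

    /* Arithmetic: NaN poisons the result, so a subtraction-based
     * update carries the NaN through to the running minimum. */
    printf("%f\n", min - (min - new_val));   /* prints nan */
    return 0;
}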
Re: [Numpy-discussion] Generalized ufuncs?
On Fri, Aug 15, 2008 at 1:16 AM, Travis E. Oliphant [EMAIL PROTECTED] wrote: The biggest reason is that the patch requires changing the C-API and we are already doing that for 1.2. I would rather not do it again for another 6 months at least. I don't think we should make the patch wait that long. I understand the concern, but that should have been discussed, I think. Changing C code affects most of the release process, not just people concerned with API stability. From my POV, recent C code changes caused me a lot of trouble with building binaries. If we keep changing C code during the beta, I won't be able to follow. The problem I see with any C change (not necessarily a C-API change) is that it can break a lot of things. For example, I did not notice it at first, but some of the generated code (the umath stuff, mtrand) breaks Visual Studio compilation because of too-long strings. If we accept changes to the C code during the beta phase, it just does not mean much to have a beta. The point of having time-based releases is to enforce this kind of thing; if we don't, then not only do time-based releases not make sense, but they make those problems even worse (no benefit, and we rush out things which do not work). I have seen a lot of scipy/numpy bugs and matplotlib problems reported on the ML and trac in the last few days. Putting in C code that is not bug-fix related will make this a never-ending battle. cheers, David
Re: [Numpy-discussion] Different results from repeated calculation, part 2
Keith Goodman wrote: On Thu, Aug 14, 2008 at 11:29 AM, Bruce Southey [EMAIL PROTECTED] wrote: Keith Goodman wrote: I get slightly different results when I repeat a calculation. I've seen this problem before (it went away but has returned): http://projects.scipy.org/pipermail/numpy-discussion/2007-January/025724.html A unit test is attached. It contains three tests: In test1, I construct matrices x and y and then repeatedly calculate z = calc(x,y). The result z is the same every time. So this test passes. In test2, I construct matrices x and y each time before calculating z = calc(x,y). Sometimes z is slightly different. But the x's test to be equal and so do the y's. This test fails (on Debian Lenny, Core 2 Duo, with libatlas3gf-sse2 but not with libatlas3gf-sse). test3 is the same as test2 but I calculate z like this: z = calc(100*x,y) / (100 * 100). This test passes. I get:

==
FAIL: repeatability #2
--
Traceback (most recent call last):
  File "/home/[snip]/test/repeat_test.py", line 73, in test_repeat_2
    self.assert_(result, msg)
AssertionError: Max difference = 2.04946e-16
--

Should a unit test like this be added to numpy?

Hi, In the function 'test_repeat_2' you are redefining variables 'x and y' that were first defined using the setup function. (Also, you are not using the __init__ function.) I vaguely recall there are some quirks to Python classes with this, so does the problem go away if you use 'a,b' instead of 'x, y'? (I suspect the answer is yes given test_repeat_3). Note that you should also test that 'x' and 'y' are the same here as well (but these have been redefined...). Otherwise, can you please provide your OS (version), computer processor, Python version, numpy version, version of atlas (or similar) and compiler used? I went back and reread the thread but I could not see this information.
Here's a test that doesn't use classes and checks that x and y do not change: http://projects.scipy.org/pipermail/numpy-discussion/attachments/20070127/52b3a51c/attachment.py I'm using binaries from Debian Lenny:

$ uname -a
Linux jan 2.6.25-2-686 #1 SMP Fri Jul 18 17:46:56 UTC 2008 i686 GNU/Linux
$ python -V
Python 2.5.2
>>> numpy.__version__
'1.1.0'
$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
stepping : 6
cpu MHz : 2402.004
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips : 4807.45
clflush size : 64

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
stepping : 6
cpu MHz : 2402.004
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips : 4750.69
clflush size : 64

I do not get this on my Intel Quad core2 Linux x64 system with an x86_64 Fedora 10 supplied Python. I do compile my own versions of NumPy and currently don't use or really plan to use atlas. But I know that you previously indicated that this was atlas related (http://projects.scipy.org/pipermail/numpy-discussion/2007-January/025750.html). From Intel's website, the Intel Core2 Duo E6600 (http://processorfinder.intel.com/details.aspx?sSpec=SL9S8) supports EM64T so it is an x86 64-bit processor. I do not know
Re: [Numpy-discussion] NumPy 1.2.0b2 released
On Fri, Aug 15, 2008 at 02:59:43AM -0500, Stéfan van der Walt wrote: 2008/8/15 Robert Kern [EMAIL PROTECTED]: The devil is in the details. What exactly do you propose? When we discussed this last time, the participants more or less agreed that environment variables could cause more fragility than they're worth. It also breaks the first time you try to import a numpy-using library that was not written with this in mind. Basically, you're stuck with only code that you've written. First, I propose that I write some code. Second, I do not suggest the behaviour above, but: 1) Expose a new interface to numpy, called numpy.api 2) If a certain environment variable is set, the numpy namespace is not populated, and numpy.api becomes instantaneous to load. That doesn't work because of a feature in Python's import: when loading foo.bar, Python loads foo.__init__ first. This is why we have api modules all over ETS. Gaël
Re: [Numpy-discussion] NumPy 1.2.0b2 released
Fri, 15 Aug 2008 15:30:20 +0200, Gael Varoquaux wrote: On Fri, Aug 15, 2008 at 02:59:43AM -0500, Stéfan van der Walt wrote: [clip] 1) Expose a new interface to numpy, called numpy.api 2) If a certain environment variable is set, the numpy namespace is not populated, and numpy.api becomes instantaneous to load. That doesn't work because of a feature in Python's import: when loading foo.bar, Python loads foo.__init__ first. This is why we have api modules all over ETS. I think you can still do something evil, like this:

import os
if os.environ.get('NUMPY_VIA_API', '0') != '0':
    from numpy.lib.fromnumeric import *
    ...

But I'm not sure how many milliseconds must be gained to justify this... -- Pauli Virtanen
[Numpy-discussion] patch for new mgrid / ogrid functionality
Hi, A while back, I sent some changes to index_tricks.py that would allow mgrid and ogrid to mesh things other than slices. For example:

>>> mgrid[['a','b'],[float,int],:3]
[array([[['a', 'a', 'a'],
        ['a', 'a', 'a']],
       [['b', 'b', 'b'],
        ['b', 'b', 'b']]], dtype='|S1'),
 array([[[<type 'float'>, <type 'float'>, <type 'float'>],
        [<type 'int'>, <type 'int'>, <type 'int'>]],
       [[<type 'float'>, <type 'float'>, <type 'float'>],
        [<type 'int'>, <type 'int'>, <type 'int'>]]], dtype=object),
 array([[[0, 1, 2],
        [0, 1, 2]],
       [[0, 1, 2],
        [0, 1, 2]]])]

At the time, there wasn't much follow-up, but I am hoping that there is still interest in this functionality, as I have gone ahead and finished the patch, including documentation changes and updates to test_index_tricks.py. Attached is a patch set against the latest subversion of the numpy trunk. I don't think I am allowed to commit the changes myself - correct me if I am wrong. This functionality seems like a nice addition to me as it allows one to mesh things that are not uniformly spaced and potentially not even numbers. The changes don't affect functionality that existed previously except for one minor change - instead of returning a numpy array of arrays, mgrid/ogrid now return a list of arrays. However, this is unlikely to be a problem as the majority of users generally unpack the results of mgrid/ogrid so that each matrix can be used individually. Comments welcome. Cheers, David -- ** David M. Kaplan Charge de Recherche 1 Institut de Recherche pour le Developpement Centre de Recherche Halieutique Mediterraneenne et Tropicale av. Jean Monnet B.P. 171 34203 Sete cedex France Phone: +33 (0)4 99 57 32 27 Fax: +33 (0)4 99 57 32 95 http://www.ur097.ird.fr/team/dkaplan/index.html **

Index: numpy/lib/tests/test_index_tricks.py
===================================================================
--- numpy/lib/tests/test_index_tricks.py (revision 5654)
+++ numpy/lib/tests/test_index_tricks.py (working copy)
@@ -24,15 +24,21 @@
     def test_nd(self):
         c = mgrid[-1:1:10j,-2:2:10j]
         d = mgrid[-1:1:0.1,-2:2:0.2]
-        assert(c.shape == (2,10,10))
-        assert(d.shape == (2,20,20))
+        assert(array(c).shape == (2,10,10))
+        assert(array(d).shape == (2,20,20))
         assert_array_equal(c[0][0,:],-ones(10,'d'))
         assert_array_equal(c[1][:,0],-2*ones(10,'d'))
         assert_array_almost_equal(c[0][-1,:],ones(10,'d'),11)
         assert_array_almost_equal(c[1][:,-1],2*ones(10,'d'),11)
-        assert_array_almost_equal(d[0,1,:]-d[0,0,:], 0.1*ones(20,'d'),11)
-        assert_array_almost_equal(d[1,:,1]-d[1,:,0], 0.2*ones(20,'d'),11)
+        assert_array_almost_equal(d[0][1,:]-d[0][0,:], 0.1*ones(20,'d'),11)
+        assert_array_almost_equal(d[1][:,1]-d[1][:,0], 0.2*ones(20,'d'),11)
+
+    def test_listargs(self):
+        e = mgrid[ :2, ['a', 'b', 'c'], [1,5,50,500] ]
+        assert( array(e).shape == (3,2,3,4) )
+        assert_array_equal( e[0][:,1,1].ravel(), r_[:2] )
+        assert_array_equal( e[1][1,:,1].ravel(), array(['a','b','c']) )
+        assert_array_equal( e[2][1,1,:].ravel(), array([1,5,50,500]) )

 class TestConcatenator(TestCase):
     def test_1d(self):
Index: numpy/lib/index_tricks.py
===================================================================
--- numpy/lib/index_tricks.py (revision 5654)
+++ numpy/lib/index_tricks.py (working copy)
@@ -11,7 +11,7 @@
 from numpy.core.numerictypes import find_common_type
 import math

-import function_base
+import function_base, shape_base
 import numpy.core.defmatrix as matrix
 makemat = matrix.matrix
@@ -118,6 +118,10 @@
     number of points to create between the start and stop values, where
     the stop value **is inclusive**.

+    One can also use lists or arrays as indexing arguments, in which case
+    these will be meshed out themselves instead of generating matrices from
+    the slice arguments. See examples below.
+
     If instantiated with an argument of sparse=True, the mesh-grid is
     open (or not fleshed out) so that only one-dimension of each returned
     argument is greater than 1
@@ -126,19 +130,38 @@
     >>> mgrid = np.lib.index_tricks.nd_grid()
     >>> mgrid[0:5,0:5]
-    array([[[0, 0, 0, 0, 0],
-            [1, 1, 1, 1, 1],
-            [2, 2, 2, 2, 2],
-            [3, 3, 3, 3, 3],
-            [4, 4, 4, 4, 4]],
-    <BLANKLINE>
-           [[0, 1, 2, 3, 4],
-            [0, 1, 2, 3, 4],
-            [0, 1, 2, 3, 4],
-            [0, 1, 2, 3, 4],
-            [0, 1, 2, 3, 4]]])
+    [array([[0, 0, 0, 0, 0],
+           [1, 1, 1, 1, 1],
+           [2, 2, 2, 2, 2],
+           [3, 3, 3, 3, 3],
+           [4, 4, 4, 4, 4]]), array([[0, 1, 2, 3, 4],
+           [0, 1, 2, 3, 4],
+           [0, 1, 2, 3, 4],
+           [0, 1, 2, 3, 4],
+           [0, 1, 2, 3, 4]])]
     >>> mgrid[-1:1:5j]
     array([-1. , -0.5,  0. ,
Re: [Numpy-discussion] NumPy 1.2.0b2 released
On Thu, Aug 14, 2008 at 3:58 PM, Alan G Isaac [EMAIL PROTECTED] wrote: Two odd failures in test_print.py. Platform: Win XP SP3 on Intel T2600. Alan Isaac I got the fixes to make numpy buildable again with VS 2003, and the errors are mingw specific. Either a compiler bug or, more likely, a configuration problem (mingw and vs not using the same codepath somewhere). At least now I can compare the two, and it should not take too much time to sort that out. cheers, David
Re: [Numpy-discussion] NumPy 1.2.0b2 released
On Fri, Aug 15, 2008 at 4:21 AM, Jon Wright [EMAIL PROTECTED] wrote: It seems the new release breaks matplotlib, for those poor souls who are using pre-compiled binaries at least. If this means all C-modules compiled against numpy have to be recompiled, then this will make me very unhappy. Yes, the new release requires a recompile. -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/
Re: [Numpy-discussion] NumPy 1.2.0b2 released
On Aug 15, 2008, at 4:38 PM, Pauli Virtanen wrote: I think you can still do something evil, like this: import os; if os.environ.get('NUMPY_VIA_API', '0') != '0': from numpy.lib.fromnumeric import * ... But I'm not sure how many milliseconds must be gained to justify this... I don't think it's enough. I don't like environment variable tricks like that. My tests suggest:

current SVN: 0.12 seconds
my patch: 0.10 seconds
removing some top-level imports: 0.09 seconds
my patch and removing some additional top-level imports: 0.08 seconds (this is a guess)

First, I reverted my patch, so my import times went from 0.10 seconds to 0.12 seconds. Second, I commented out the pure module imports from numpy/__init__.py:

import linalg
import fft
import random
import ctypeslib
import ma
import doc

The import time went to 0.089. Note that my patch also gets rid of import doc and import ctypeslib, which take up a good chunk of time. The fft, linalg, and random libraries take 0.002 seconds each, and ma takes 0.007. Not doing these imports makes the code about 0.01 seconds faster than my patches, which shaved off 0.02 seconds. That 0.01 second comes from not importing the fft, linalg, and ma modules. My patch does improve things in a few other places, so perhaps those other places add another 0.01 seconds of performance. Why can't things be better? Take a look at the slowest imports. (Note, times are inclusive of the children.)

== Slowest (including children) ==
0.089 numpy (None)
0.085 add_newdocs (numpy)
0.079 lib (add_newdocs)
0.041 type_check (lib)
0.040 numpy.core.numeric (type_check)
0.015 _internal (numpy.core.numeric)
0.014 numpy.testing (lib)
0.014 re (_internal)
0.010 unittest (numpy.testing)
0.010 numeric (numpy.core.numeric)
0.009 io (lib)

Most of the time is spent importing 'lib'. Can that be made quicker? Not easily. lib is first imported in add_newdocs. Personally, I want to get rid of add_newdocs and move the docstrings into the correct locations. Stubbing the function out by adding

def add_newdoc(*args): pass

to the top of add_newdocs.py saves 0.005 seconds, but if you try it out and remove the import of lib from add_newdocs.py then you'll have to fix a cyclical dependency:

numpy/__init__.py: import core
numpy/core/__init__.py: from defmatrix import *
numpy/core/defmatrix.py: from numpy.lib.utils import issubdtype
numpy/lib/__init__.py: from type_check import *
numpy/lib/type_check.py: import numpy.core.numeric as _nx
AttributeError: 'module' object has no attribute 'core'

The only way out of the loop is to have numpy/__init__.py import lib before importing core. It's possible to clean up the code so this loop doesn't exist, and to fix things so that fewer things are imported when some environment variable is set, but it doesn't look easy. Modules depend on other modules a bit too much to make me happy. Andrew [EMAIL PROTECTED]
[Numpy-discussion] VS 2003 problems with cython-generated code
Hi, I noticed this morning that numpy 1.2 is not buildable with VS 2003 (the one you have to use for official python releases for at least python 2.5, and maybe 2.4). When we generate C code, both with internal code (numpy/core/code_generator) and with external tools (cython/pyrex for mtrand), the string literals generated for docstrings are too long for visual studio. We have to break them up (e.g. "foo bar" becomes "foo " "bar"), but doing so with cython-generated code is only doable by changing cython itself. So I did patch cython to break those, and regenerated the mtrand.c. This is done in the vs_longstring branch. Is it ok to put this in for 1.2? Without it, I don't see a way to have numpy 1.2 buildable with VS. cheers, David P.S: I attached the necessary patches to the cython bug tracker too, so that hopefully the problem can be solved in a future version of cython.
Re: [Numpy-discussion] NumPy 1.2.0b2 released
Jarrod Millman wrote: NumPy 1.2.0b2 is now available. Please test this so that we can uncover any problems ASAP. Mac binary: https://cirl.berkeley.edu/numpy/numpy-1.2.0b2-py2.5-macosx10.5.dmg Ran 1715 tests in 12.671s OK (SKIP=1) OS-X 10.4.11, Dual G5 PPC, Python version 2.5.2 (r252:60911, Feb 22 2008, 07:57:53) [GCC 4.0.1]. It also seems to work so far with my code... -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception [EMAIL PROTECTED]
Re: [Numpy-discussion] NumPy 1.2.0b2 released
On Fri, Aug 15, 2008 at 10:41 AM, Andrew Dalke [EMAIL PROTECTED] wrote: [...] Most of the time is spent importing 'lib'. Can that be made quicker? Not easily. lib is first imported in add_newdocs. Personally, I want to get rid of add_newdocs and move the docstrings into the correct locations. And those would be? I hope you aren't thinking of moving them into the C code. Chuck
Re: [Numpy-discussion] NumPy 1.2.0b2 released
I forgot to mention.. On Aug 15, 2008, at 9:00 AM, Travis E. Oliphant wrote: 1) Removing ctypeslib import * Can break code if somebody has been doing import numpy and then using numpy.ctypeslib * I'm fine with people needing to import numpy.ctypeslib to use the capability as long as we clearly indicate these breakages. You were the one who had numpy/__init__.py always import ctypeslib: r3027 | oliphant | 2006-08-15 11:53:49 +0200 (Tue, 15 Aug 2006) | 1 line import ctypeslib on numpy load and change name from ctypes_load_library to load_library Was there a driving reason for that other than decreased user burden? There will be breakage in the wild. I found: http://mail.python.org/pipermail/python-list/2007-December/469132.html http://www.scipy.org/Cookbook/Ctypes and a Google Code search found a couple hits too: http://www.google.com/codesearch?q=numpy+ctypeslib&hl=en&btnG=Search+Code It doesn't look like there will be a big impact. This is not a widely used package (in public code), and many examples seem to prefer this form: from numpy.ctypeslib import ndpointer, load_library Andrew [EMAIL PROTECTED]
Re: [Numpy-discussion] VS 2003 problems with cython-generated code
On Fri, Aug 15, 2008 at 10:49 AM, David Cournapeau [EMAIL PROTECTED] wrote: [...] We have to break them up (e.g. "foo bar" becomes "foo " "bar"), but doing so with cython-generated code is only doable by changing cython itself. So I did patch cython to break those, and regenerated the mtrand.c. This is done in the vs_longstring branch. Is it ok to put this in for 1.2? Without it, I don't see a way to have numpy 1.2 buildable with VS. Be careful if you break across lines. The gnu compilers will accept

    "foo"
    "bar"

But for some others you need to use a line continuation:

    "foo\
    bar"

Is this mostly for the ufuncs? I'm not sure why we can't make that operate like add_newdocs. Chuck
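[A small sketch of what the generated output needs to look like (illustrative strings, not the real generated docstrings): ISO C concatenates adjacent string literals at compile time, so a long docstring can be emitted in short pieces, with no embedded newlines, no runtime cost, and each piece staying under the compiler's per-literal length limit.]

static const char example_doc[] =
    "This is the first piece of a long generated docstring, "
    "and this is the second piece; the compiler joins the two "
    "into a single literal before any per-literal length limit applies.";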
Re: [Numpy-discussion] NumPy 1.2.0b2 released
Andrew Dalke wrote: I forgot to mention.. On Aug 15, 2008, at 9:00 AM, Travis E. Oliphant wrote: 1) Removing ctypeslib import * Can break code if somebody has been doing import numpy and then using numpy.ctypeslib * I'm fine with people needing to import numpy.ctypeslib to use the capability as long as we clearly indicate these breakages. You were the one who had numpy/__init__.py always import ctypeslib: r3027 | oliphant | 2006-08-15 11:53:49 +0200 (Tue, 15 Aug 2006) | 1 line import ctypeslib on numpy load and change name from ctypes_load_library to load_library Was there a driving reason for that other than decreased user burden? Not that I can recall. -Travis
Re: [Numpy-discussion] NumPy 1.2.0b2 released
2008/8/15 Andrew Dalke [EMAIL PROTECTED]: I don't think it's enough. I don't like environment variable tricks like that. My tests suggest:

current SVN: 0.12 seconds
my patch: 0.10 seconds
removing some top-level imports: 0.09 seconds
my patch and removing some additional top-level imports: 0.08 seconds (this is a guess)

There are two different concerns being addressed here: ease of use and load time. I am not sure the two can be optimised simultaneously. On the other hand, if we had two public APIs for importing (similar to matplotlib's pylab vs. pyplot), we could satisfy both parties, without placing too much of a burden upon developers. Stéfan
Re: [Numpy-discussion] VS 2003 problems with cython-generated code
On Fri, Aug 15, 2008 at 12:18 PM, Charles R Harris [EMAIL PROTECTED] wrote: Be careful if you break across lines. The gnu compilers will accept "foo" "bar". But for some others you need to use a line continuation. I don't put newlines in: I really do "foo" "bar", to avoid this exact problem. I tested the changes with gcc and visual studio (that's really the minimal set we want to support at any release). The changes are done in the code generator, because that's where the problem was, but maybe this could have been changed somewhere else. For mtrand, that's different, though: I am the only one who can generate the file right now. That's the object of my email, and the reason why I created a new branch for this. cheers, David
Re: [Numpy-discussion] NumPy 1.2.0b2 released
On Fri, Aug 15, 2008 at 11:41 AM, Andrew Dalke [EMAIL PROTECTED] wrote: It's possible to clean up the code so this loop doesn't exist, and fix things so that fewer things are imported when some environment variable is set, but it doesn't look easy. Modules depend on other modules a bit too much to make me happy. Yes. numpy.core should not depend on anything else. That would be the easy thing to do: there is only one function used IIRC from numpy.lib. As you said, the hairy stuff (from import dependency POV) is in numpy.lib. cheers, David
[Numpy-discussion] Addition of a dict object to all NumPy objects
Hello all, While we are on the subject of C-API changes, I've noticed that quite a few of the sub-classes of ndarray are constructed to basically add meta-information to the array. What if the base-class ndarray grew a dict object at its end to hold meta information? Naturally, several questions arise: 1) How would you access the dictionary? (i.e. __dict__?) 2) Would attribute setting and getting retrieve from this dictionary (how are conflicts managed)? * I think I would prefer a dict attribute on the numpy array that gets and sets into the dictionary. 3) Are the additional 4-8 bytes too expensive? 4) Should there instead be a C-level array sub-class with the dict attached? 5) How should dicts be propagated through views? My preference is to not propagate them at all. Thoughts? -Travis
Re: [Numpy-discussion] Addition of a dict object to all NumPy objects
On Fri, Aug 15, 2008 at 1:58 PM, Travis E. Oliphant [EMAIL PROTECTED] wrote: [...] What if the base-class ndarray grew a dict object at its end to hold meta information? [...] 3) Are the additional 4-8 bytes too expensive? One of the problems with numarray was the time taken to allocate small arrays. Would adding a dictionary slow down the allocation of numpy arrays? Chuck
Re: [Numpy-discussion] Addition of a dict object to all NumPy objects
On Fri, Aug 15, 2008 at 2:10 PM, Charles R Harris [EMAIL PROTECTED] wrote: [...] Would adding a dictionary slow down the allocation of numpy arrays? That said, I think we should keep things as simple and orthogonal as possible. If we go this way, I think a subclass with a dictionary would be the best approach to avoid the heartbreak of creeping featuritis. Chuck
Re: [Numpy-discussion] Addition of a dict object to all NumPy objects
On Fri, Aug 15, 2008 at 15:15, Charles R Harris [EMAIL PROTECTED] wrote: [...] That said, I think we should keep things as simple and orthogonal as possible. If we go this way, I think a subclass with a dictionary would be the best approach to avoid the heartbreak of creeping featuritis. The point of the feature is to avoid subclasses. There are a number of use cases for annotating arrays with metadata. Currently, they are forced to use subclasses. Every time you use ndarray subclasses, you are essentially forcing yourself into your subclass's ghetto of functions that only work on your subclass. I think you could make the dictionary created lazily on the first getattr(). -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco
Re: [Numpy-discussion] Addition of a dict object to all NumPy objects
Robert Kern wrote: 3) Are the additional 4-8 bytes too expensive? One of the problems with numarray was the time taken to allocate small arrays. Would adding a dictionary slow down the allocation of numpy arrays? No, I don't think so, not if we did nothing by default but set the dict to NULL (i.e. no propagation of meta-information to new arrays). That said, I think we should keep things as simple and orthogonal as possible. If we go this way, I think a subclass with a dictionary would be the best approach to avoid the heartbreak of creeping featuritis. The point of the feature is to avoid subclasses. There are a number of use cases for annotating arrays with metadata. Currently, they are forced to use subclasses. Every time you use ndarray subclasses, you are essentially forcing yourself into your subclass's ghetto of functions that only work on your subclass. This would be one step better in the sense that there would be a single sub-class to handle all cases of just needing meta information. But I tend to agree that adding the dictionary to all arrays is preferable. I think you could make the dictionary created lazily on the first getattr(). Yes, that could easily be done. It would just be set to NULL on creation and the penalty/overhead would be the extra pointer in the array structure. -Travis
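[For concreteness, a minimal C sketch of the lazy scheme being agreed on here; this is a toy type, not NumPy's actual struct layout or API. The struct reserves one pointer, allocation leaves it NULL, and the dict is only built when first requested.]

#include <Python.h>

typedef struct {
    PyObject_HEAD
    /* ... the existing array fields would live here ... */
    PyObject *meta_dict;   /* NULL until the first attribute access */
} SketchArrayObject;

/* Getter suitable for a tp_getset entry: creates the dict on demand,
 * so arrays that never use metadata pay only for the pointer. */
static PyObject *
sketcharray_get_meta(SketchArrayObject *self, void *closure)
{
    if (self->meta_dict == NULL) {
        self->meta_dict = PyDict_New();   /* lazy creation */
        if (self->meta_dict == NULL)
            return NULL;                  /* propagate MemoryError */
    }
    Py_INCREF(self->meta_dict);
    return self->meta_dict;
}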
Re: [Numpy-discussion] NumPy 1.2.0b2 released
Andrew Dalke wrote: Can that be made quicker? Not easily. lib is first imported in add_newdocs. Personally, I want to get rid of add_newdocs and move the docstrings into the correct locations. Where would that be, in the C-code? The reason for add_newdocs is to avoid writing docstrings in C-code which is a pain. It's possible to clean up the code so this loop doesn't exist, and fix things so that fewer things are imported when some environment variable is set, but it doesn't look easy. Modules depend on other modules a bit too much to make me happy. I've removed this loop. Are there other places in numpy.core that depend on numpy.lib? Thanks for the very helpful analysis. -Travis
Re: [Numpy-discussion] Addition of a dict object to all NumPy objects
On Fri, Aug 15, 2008 at 3:12 PM, Travis E. Oliphant [EMAIL PROTECTED] wrote: [...] Yes, that could easily be done. It would just be set to NULL on creation and the penalty/overhead would be the extra pointer in the array structure. That doesn't sound bad, and the convenience is probably worth it. Can this be done in a way that doesn't require a new compile? That is, can it look like a subclass in C? I'm opposed to adding anything until 1.2 is out. Chuck
Re: [Numpy-discussion] min() of array containing NaN
Availability of the NaN functionality in a method of ndarray The last point is key. The NaN behavior is central to analyzing real data containing unavoidable bad values, which is the bread and butter of a substantial fraction of the user base. In the languages they're switching from, handling NaNs is just part of doing business, and is an option of every relevant routine; there's no need for redundant sets of routines. In contrast, numpy appears to consider data analysis to be secondary, somehow, to pure math, and takes the NaN functionality out of routines like min() and std(). This means it's not possible to use many ndarray methods. If we're ready to handle a NaN by returning it, why not enable the more useful behavior of ignoring it, at user discretion? Maybe I missed this somewhere, but this seems like a better use case for masked arrays, not NaNs. Masked arrays were specifically designed to add functions that work well with masked/invalid data points. Why reinvent the wheel here? Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma
Re: [Numpy-discussion] Addition of a dict object to all NumPy objects
Robert Kern wrote: I think you could make the dictionary created lazily on the first getattr(). In order to make it work you have to reserve space for a PyObject* pointer for the instance dict somewhere in your type definition. It's going to increase the size of every object by 4 bytes on a 32bit OS or 8 bytes on a 64bit OS, aka sizeof(uintptr_t). An empty dict increases the size of every object by ~30 bytes. Christian
[Numpy-discussion] Questions about some of the random functions
Just a question - I'm gradually working through the distributions for the documentation marathon and I realised that there is a whole nest of them named standard-. For several (e.g., normal) they are just the regular distribution with all the parameters except size set to standard values. So my first question is - why? They seem very redundant. Second question - why is there a standard-t for Student's t-test (or the distribution associated with it) but no corresponding t distribution? I probably have too much time to think about it this weekend. 8-) -- --- | Alan K. Jackson | To see a World in a Grain of Sand | | [EMAIL PROTECTED] | And a Heaven in a Wild Flower, | | www.ajackson.org | Hold Infinity in the palm of your hand | | Houston, Texas | And Eternity in an hour. - Blake | ---
[Numpy-discussion] [REVIEW] Update NumPy API format to support updates that don't break binary compatibility
The current NumPy API number, stored as NPY_VERSION in the header files, needs to be incremented every time the NumPy C-API changes. The counter tells developers with exactly which revision of the API they are dealing. NumPy does some checking to make sure that it does not run against an old version of the API. Currently, we have no way of distinguishing between changes that break binary compatibility and those that don't. The proposed fix breaks the version number up into two counters -- one that gets increased when binary compatibility is broken, and another when the API is changed without breaking compatibility. Backward compatibility with packages such as Matplotlib is maintained by renaming NPY_VERSION to NPY_BINARY_VERSION. Please review the proposed change at http://codereview.appspot.com/2946 Regards Stéfan
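[For illustration, a minimal C sketch of the two-counter scheme under review; the macro names and values here are illustrative, not the actual NumPy macros. The binary counter must match exactly, while the feature counter only needs to be at least as new as what the extension was compiled against.]

#define SKETCH_BINARY_VERSION  0x01000009U  /* bumped only when the ABI breaks */
#define SKETCH_FEATURE_VERSION 0x00000001U  /* bumped on compatible additions */

/* Called at import time with the two version numbers the extension
 * saw in the headers when it was built. */
static int
sketch_check_api(unsigned int built_binary, unsigned int built_feature)
{
    if (built_binary != SKETCH_BINARY_VERSION)
        return -1;   /* ABI mismatch: the extension must be recompiled */
    if (built_feature > SKETCH_FEATURE_VERSION)
        return -1;   /* extension built against a newer API than this runtime */
    return 0;        /* binary compatible, all requested features present */
}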
Re: [Numpy-discussion] Questions about some of the random functions
On Fri, Aug 15, 2008 at 17:26, Alan Jackson [EMAIL PROTECTED] wrote: Just a question - I'm gradually working through the distributions for the documentation marathon and I realised that there is a whole nest of them named standard-. For several (e.g., normal) they are just the regular distribution with all the parameters except size set to standard values. Not quite. Rather, the regular distributions are built up from the standard version by transformation. So my first question is - why? They seem very redundant. At the C level, they sometimes exist because they are components of other distributions that don't need the x*1.0 + 0.0 waste. At the Python level, they usually exist for backwards compatibility with the libraries I was replacing, or because I thought they would be useful for Python-level implementations of some distributions in scipy.stats. Second question - why is there a standard-t for Student's t-test but no corresponding t distribution? Apathy, to be honest. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco
Re: [Numpy-discussion] Addition of a dict object to all NumPy objects
On Fri, Aug 15, 2008 at 17:31, Christian Heimes [EMAIL PROTECTED] wrote: Robert Kern wrote: I think you could make the dictionary created lazily on the first getattr(). In order to make it work you have to reserve space for a PyObject* pointer for the instance dict somewhere in your type definition. It's going to increase the size of every object by 4 bytes on a 32bit OS or 8 bytes on a 64bit OS, aka sizeof(uintptr_t). An empty dict increases the size of every object by ~30 bytes. Yes, we know that. The concern I was addressing was the time overhead for creating the new dict object every time an ndarray gets instantiated. Most of these dict objects would be unused, so we would be wasting a substantial amount of time. If you push off the creation of the dict to the first time the user accesses it, then we're not wasting any time. We do realize that space for the pointer must still be reserved. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco
[Numpy-discussion] long double woes on win32
Hi, The test failures on windows with 1.2b2 are due to buggy long double behavior in mingw. My understanding is that on windows, long double is 8 bytes (that's the sizeof value returned by VS 2003), but mingw says 12 bytes. One solution would be forcing numpy to configure itself to handle long double as double. But doing this breaks the configuration in many ways, and it would require relatively important changes to code that is quite fragile in numpy ATM (the whole math function business in umathmodule.c.src). I don't see this happening for 1.2. Another one is to use VS 2003 for the binaries: but then we are back to the problem of numpy being non-buildable with VS 2003. The last one is to ignore this for 1.2 (long double was already broken before in the numpy win32 binaries). What do people think? cheers, David
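[A quick standalone probe (a sketch, not NumPy code) makes the mismatch David describes visible: built with MSVC it reports long double and double as the same size, while 32-bit mingw reports a wider long double even though both compilers target the same Microsoft C runtime.]

#include <stdio.h>

int main(void)
{
    /* VS 2003 prints 8 and 8: long double is an alias for double.
     * mingw gcc on win32 prints 8 and 12: the x87 80-bit format
     * padded for alignment, a representation the MS runtime's math
     * and printf routines do not handle, hence the buggy behavior. */
    printf("sizeof(double)      = %u\n", (unsigned)sizeof(double));
    printf("sizeof(long double) = %u\n", (unsigned)sizeof(long double));
    return 0;
}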
[Numpy-discussion] Generalised ufuncs branch
Hi all,

I have moved the generalised ufuncs functionality off to a branch:

http://svn.scipy.org/svn/numpy/branches/gen_ufuncs

Please try it out and give us your feedback. We shall also pound on it at the sprint during SciPy'08, and thereafter decide how and when best to integrate it into NumPy.

Thanks to Wenjie Fu and Hans-Andreas Engel for taking the time to think this issue through and to submit such a high-quality patch.

Regards
Stéfan
Re: [Numpy-discussion] long double woes on win32
On Fri, Aug 15, 2008 at 5:43 PM, David Cournapeau [EMAIL PROTECTED] wrote:

> The test failures on windows with 1.2b2 are due to buggy long double behavior in mingw. My understanding is that on windows, long double is 8 bytes (that's the sizeof value returned by VS 2003), but mingw says 12 bytes.

Doesn't mingw use the MSVC library? IIRC, in MSVC long doubles and doubles are the same type, so this seems to be a bug in mingw. I suppose numpy could detect this situation and define the types to be identical, but if this is impractical, then perhaps the best thing to do is issue an error message.

> One solution would be forcing numpy to configure itself to handle long double as double. [...] What do people think?

There isn't much you can do about long doubles while maintaining MSVC compatibility.

Chuck
Re: [Numpy-discussion] Addition of a dict object to all NumPy objects
Robert Kern wrote:

> Yes, we know that. The concern I was addressing was the time overhead of creating a new dict object every time an ndarray gets instantiated. Most of these dict objects would be unused, so we would be wasting a substantial amount of time. If you push off the creation of the dict to the first time the user accesses it, then we're not wasting any time. We do realize that space for the pointer must still be reserved.

I'm sorry for pointing out the obvious. I *guess* it's possible to delay the creation even further. You don't have to create a dict until somebody assigns a new attribute to an instance. It'd require some more code, and you'd have to trade memory efficiency for slightly slower access to the additional attributes.

Please also note that CPython uses a freelist of unused dict instances. The default size of the dict free list is 80 elements. The allocation and deallocation of dicts is cheap if you can stay below that threshold.

Christian
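[Editorial note: a variant of the earlier pure-Python sketch shows Christian's further refinement - allocate only on the first assignment, so attribute reads on a pristine object never pay for a dict. Again purely illustrative; the names are made up.]

    class AssignLazy(object):
        __slots__ = ('_d',)  # reserved pointer, initially unset

        def __getattr__(self, name):
            # Reads never allocate: with no dict yet, the attribute
            # simply does not exist.
            try:
                d = object.__getattribute__(self, '_d')
            except AttributeError:
                raise AttributeError(name)
            try:
                return d[name]
            except KeyError:
                raise AttributeError(name)

        def __setattr__(self, name, value):
            # Only the first assignment pays for the allocation.
            try:
                d = object.__getattribute__(self, '_d')
            except AttributeError:
                d = {}
                object.__setattr__(self, '_d', d)
            d[name] = value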
Re: [Numpy-discussion] Different results from repeated calculation, part 2
Alok Singhal wrote:

> On 14/08/08: 10:20, Keith Goodman wrote:
>> A unit test is attached. It contains three tests: In test1, I construct matrices x and y and then repeatedly calculate z = calc(x,y). The result z is the same every time. So this test passes. In test2, I construct matrices x and y each time before calculating z = calc(x,y). Sometimes z is slightly different. But the x's test to be equal and so do the y's. This test fails (on Debian Lenny, Core 2 Duo, with libatlas3gf-sse2 but not with libatlas3gf-sse). test3 is the same as test2 but I calculate z like this: z = calc(100*x,y) / (100 * 100). This test passes. I get:
>>
>> ==
>> FAIL: repeatability #2
>> --
>> Traceback (most recent call last):
>>   File "/home/[snip]/test/repeat_test.py", line 73, in test_repeat_2
>>     self.assert_(result, msg)
>> AssertionError: Max difference = 2.04946e-16
>
> Could this be because of how the calculations are done? If the floating point numbers are stored in the cpu registers, in this case (intel core duo), they are 80-bit values, whereas 'double' precision is 64 bits. Depending upon gcc's optimization settings, the number of automatic variables, etc., it is entirely possible that the numbers are stored in registers only in some cases, and are in RAM in other cases. Thus, in your tests, sometimes some numbers get stored in the cpu registers, making the calculations with those values different from the case where they are not stored in the registers.
>
> See "The pitfalls of verifying floating-point computations" at http://portal.acm.org/citation.cfm?doid=1353445.1353446 (or if that needs subscription, you can download the PDF from http://arxiv.org/abs/cs/0701192). The paper has a lot of examples of surprises like this. Quote:
>
> We shall discuss the following myths, among others: ...
> - "Arithmetic operations are deterministic; that is, if I do z=x+y in two places in the same program and my program never touches x and y in the meantime, then the results should be the same."
> - A variant: "If x < 1 tests true at one point, then x < 1 stays true later if I never modify x."
> ...
>
> -Alok

Yep! The code is buggy. Please **never** use == to test whether two floating point numbers are equal. **Never**. As explained by Alok, floating point computations are *not* the same as computations over the field of real numbers. Intel 80-bit registers versus 64-bit representation, IEEE 754, sse or not sse or mmx: fun with floating point arithmetic :) The same code with and without SSE can give you unequal results.

Never use == on floats?!? Well, you would have to use it in some corner cases, but please remember that computers only work on a small, finite subset of numbers. http://docs.python.org/tut/node16.html is a simple part of the story. 80-bit registers are the ugly part (but the fun one :))

abs(a-b) < epsilon is the correct way to test that a and b are "equal".

Xavier
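[Editorial note: a minimal illustration of the comparison Xavier recommends. The tolerance is an arbitrary choice for the example, not a universal constant.]

    import numpy as np

    a = 0.1 + 0.2
    b = 0.3
    print(a == b)              # False: both sides carry rounding error
    print(abs(a - b) < 1e-12)  # True under an absolute tolerance

    # numpy.allclose combines relative and absolute tolerances, which
    # is usually more robust than a single bare epsilon.
    print(np.allclose(a, b))   # True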
Re: [Numpy-discussion] C99 on windows
David Cournapeau wrote:

> The current trunk has 14 failures on windows (with mingw). 12 of them are related to C99 (see ticket 869). Can the people involved in recent changes to complex functions take a look at it? I think this is high priority for 1.2.0.

I'm asking just out of curiosity: why is NumPy using C99, and what features of C99 are used? The Microsoft compilers don't support C99 and they probably never will. I don't know if the Intel CC supports C99. Even GCC doesn't implement C99 to its full extent. Are you planning to restrict yourself to MinGW32?

I'm not a NumPy developer, but I'm a Python core developer. I've laid the foundation for the VS 2008 build system for 2.6 / 3.0. Mark Dickinson and I have put lots of work into mathematical, numerical and IEEE 754 fixes. The work was mostly based on the C99 specs, but we used C89 code. That should explain my interest in the matter. :]

Christian
Re: [Numpy-discussion] NumPy 1.2.0b2 released
On Aug 15, 2008, at 6:41 PM, Andrew Dalke wrote:

> I don't think it's enough. I don't like environment variable tricks like that. My tests suggest:
>
>   current SVN: 0.12 seconds
>   my patch: 0.10 seconds
>   removing some top-level imports: 0.09 seconds
>   my patch and removing some additional top-level imports: 0.08 seconds (this is a guess)

First, I reverted my patch, so my import times went from 0.10 seconds to 0.12 seconds. Turns out I didn't revert everything. As of the SVN version from 10 minutes ago, "import numpy" on my machine takes 0.18 seconds, not 0.12 seconds. My patch should cut the import time by about 30-40% from what it is now. On some machines. Your mileage may vary :)

In my issue report I said the import time was 0.15 seconds. Adding up the times I saved doesn't match up with my final value, so take my numbers as guidelines.

For those curious, the top cumulative import times from SVN are:

  0.184 numpy (None)
  0.103 add_newdocs (numpy)
  0.097 lib (add_newdocs)
  0.049 type_check (lib)
  0.048 numpy.core.numeric (type_check)
  0.028 io (lib)
  0.022 ctypeslib (numpy)
  0.022 ctypes (ctypeslib)
  0.021 random (numpy)
  0.021 mtrand (random)
  0.019 _import_tools (numpy)
  0.019 glob (_import_tools)
  0.018 _datasource (io)
  0.016 fnmatch (glob)
  0.015 numpy.testing (numpy.core.numeric)
  0.014 re (fnmatch)

Andrew
[EMAIL PROTECTED]
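[Editorial note: Andrew's time_import.py is not included in the thread. A rough sketch of how such per-module numbers can be gathered follows, written in the Python 2 idiom of the era; the reporting threshold is an arbitrary choice, and this is not Andrew's actual script.]

    import __builtin__
    import time

    real_import = __builtin__.__import__
    depth = [0]

    def timing_import(name, *args, **kwargs):
        # Wrap __import__ and log how long each import takes;
        # cached re-imports return almost instantly and are skipped
        # by the threshold below.
        depth[0] += 1
        start = time.time()
        try:
            return real_import(name, *args, **kwargs)
        finally:
            depth[0] -= 1
            elapsed = time.time() - start
            if elapsed > 0.01:
                print('%s%.3f %s' % ('  ' * depth[0], elapsed, name))

    __builtin__.__import__ = timing_import
    import numpy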
Re: [Numpy-discussion] C99 on windows
On Fri, Aug 15, 2008 at 7:25 PM, Christian Heimes [EMAIL PROTECTED] wrote:

> I'm asking just out of curiosity: why is NumPy using C99, and what features of C99 are used?

I believe C99 was used as a guide to how complex corner cases involving +/-0, +/-inf, etc. should behave. However, it doesn't look possible to make that behaviour portable without a lot of work, and it probably isn't worth the trouble. At the moment the failing tests have been removed.

Chuck
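[Editorial note: signed zeros give a quick feel for the corner cases in question. IEEE 754 makes 0.0 and -0.0 compare equal, yet functions with branch cuts distinguish them; plain Python, nothing numpy-specific.]

    import math

    print(0.0 == -0.0)             # True: the two zeros compare equal
    print(math.atan2(0.0, -1.0))   # pi: approaching -1 from above
    print(math.atan2(-0.0, -1.0))  # -pi: approaching -1 from below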
Re: [Numpy-discussion] NumPy 1.2.0b2 released
2008/8/15 Andrew Dalke [EMAIL PROTECTED]:

> Turns out I didn't revert everything. As of the SVN version from 10 minutes ago, "import numpy" on my machine takes 0.18 seconds, not 0.12 seconds. My patch should cut the import time by about 30-40% from what it is now. On some machines. Your mileage may vary :)

I realize this is already a very complicated issue, but it's worth pointing out that the times you measure are not necessarily the times users care about. These numbers are for once everything is loaded into the disk cache. They don't reflect, say, interactive startup time, or the time taken by a script that does substantial disk access (i.e. which fills the cache with something else). I realize this is the only available basis for comparison, but do keep in mind that improvements of a few milliseconds here may make a much larger difference in practice - or a much smaller one.

Anne
Re: [Numpy-discussion] NumPy 1.2.0b2 released
On Aug 15, 2008, at 11:18 PM, Travis E. Oliphant wrote:

> I've removed this loop. Are there other places in numpy.core that depend on numpy.lib?

That fixed the loop I identified. I removed the "import lib" in add_newdocs.py and things imported fine. I then commented out the following lines

  #import lib
  #from lib import *

in numpy/__init__.py. This identified a loop in fft.

[josiah:~/src] dalke% python time_import.py
Traceback (most recent call last):
  File "time_import.py", line 31, in <module>
    import numpy
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/__init__.py", line 146, in <module>
    import fft
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/fft/__init__.py", line 38, in <module>
    from fftpack import *
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/fft/fftpack.py", line 541, in <module>
    from numpy import deprecate
ImportError: cannot import name deprecate

Removing the "import fft" gives another loop for deprecate:

    import numpy
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/__init__.py", line 148, in <module>
    import ctypeslib
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ctypeslib.py", line 56, in <module>
    from numpy import integer, ndarray, dtype as _dtype, deprecate, array
ImportError: cannot import name deprecate

Removing the "import ctypeslib" gives the following loop:

Traceback (most recent call last):
  File "time_import.py", line 31, in <module>
    import numpy
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/__init__.py", line 149, in <module>
    import ma
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/__init__.py", line 44, in <module>
    import core
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ma/core.py", line 66, in <module>
    from numpy import ndarray, typecodes, amax, amin, iscomplexobj,\
ImportError: cannot import name iscomplexobj

After removing the "import ma", I ended up with no ImportErrors. The code still ends up importing numpy.lib because of

  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/linalg/linalg.py", line 28, in <module>
    from numpy.lib import triu

Take that out, and "import numpy" does not imply "import numpy.lib".

Andrew
[EMAIL PROTECTED]
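[Editorial note: the generic cure for cycles like these is to defer the offending import to the point of use. A sketch with made-up module names; this is the pattern, not the actual numpy fix.]

    # pkg/linalg.py: needs a helper from pkg.lib, but only inside
    # one function.

    def solve_upper(a):
        # Importing at call time lets pkg/__init__.py finish loading
        # this module before pkg.lib is touched, breaking the cycle.
        from pkg.lib import triu
        return triu(a)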
Re: [Numpy-discussion] NumPy 1.2.0b2 released
On Aug 15, 2008, at 11:18 PM, Travis E. Oliphant wrote:

> Where would that be, in the C-code? The reason for add_newdocs is to avoid writing docstrings in C-code, which is a pain.

That was my thought. I could see that the code might be useful during module development, where you don't want text changes to incur a recompile hit. But come release time, if someone volunteers to migrate the docstrings to C in order to get a small performance increase, then I don't see why not.

Andrew
[EMAIL PROTECTED]
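[Editorial note: for context, the add_newdocs mechanism attaches docstrings to already-compiled C objects at import time, roughly as below. This is a simplified sketch of the pattern, not a verbatim excerpt from numpy.]

    from numpy.lib import add_newdoc

    # First argument names the C module, the second the object in it;
    # the docstring is attached without touching the C source.
    add_newdoc('numpy.core.multiarray', 'zeros',
        """zeros(shape, dtype=float, order='C')

        Return a new array of the given shape and type, filled with
        zeros.
        """)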
Re: [Numpy-discussion] C99 on windows
Charles R Harris wrote:

> I believe C99 was used as a guide to how complex corner cases involving +/-0, +/-inf, etc. should behave. However, it doesn't look possible to make that behaviour portable without a lot of work, and it probably isn't worth the trouble. At the moment the failing tests have been removed.

We used the C99 specs as a guideline, too - if I recall correctly, mostly Annex F and G. We got it all sorted out, but it took us a tremendous amount of time and work. We had to reimplement a bunch of math functions like log1p and the inverse hyperbolic functions. Mark introduced a system of lookup tables for complex corner cases. You can find them at the end of the cmath module:

http://svn.python.org/projects/python/trunk/Modules/cmathmodule.c

The new pymath files contain a series of macros and re-implementations of C99 features and cross-platform workarounds:

http://svn.python.org/projects/python/trunk/Include/pymath.h
http://svn.python.org/projects/python/trunk/Python/pymath.c

HTH
Christian
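[Editorial note: log1p is a good example of why these re-implementations matter. For tiny x, evaluating log(1 + x) directly loses essentially all precision; a quick numpy illustration, exact digits varying by platform.]

    import numpy as np

    x = 1e-16
    print(np.log(1 + x))  # 0.0: 1 + x has already rounded to exactly 1.0
    print(np.log1p(x))    # ~1e-16: the correctly rounded result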
Re: [Numpy-discussion] Different results from repeated calculation, part 2
On Aug 16, 2008, at 3:02 AM, Xavier Gnata wrote:

> abs(a-b) < epsilon is the correct way to test that a and b are "equal".

But 1) the value of epsilon is algorithm dependent, 2) it may be that -epsilon1 < a-b < epsilon2, where the two epsilons are not the same value, 3) the more valid test may be <= instead of <, as when the maximum permissible difference is 0.5 ulp, and 4) the more important test (as mentioned in the Monniaux paper you referenced) may be the relative error and not the absolute error, which is how you wrote it.

So saying that this is "the correct way" isn't that helpful. It requires proper numeric analysis, and that isn't always available. An advantage to checking if a==b is that if they are in fact equal then there's no need to do any analysis. Whereas if you choose some epsilon (or epsilon-relative), then how do you pick that number?

Andrew
[EMAIL PROTECTED]
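[Editorial note: a sketch of the absolute-versus-relative distinction Andrew raises. The default tolerances below mirror numpy.allclose's convention, but the function itself is illustrative.]

    def close(a, b, rel_tol=1e-05, abs_tol=1e-08):
        # The absolute term dominates near zero; the relative term
        # dominates at large magnitudes. Choosing the right values
        # still depends on the algorithm that produced a and b.
        return abs(a - b) <= abs_tol + rel_tol * abs(b)

    print(close(1e-12, 0.0))      # True: caught by the absolute term
    print(close(1e6, 1e6 + 5.0))  # True: caught by the relative term
    print(close(1.0, 1.1))        # False on both counts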