[Numpy-discussion] ValueError: Unknown format code 'g' for object of type 'str'
numpy.__version__ '2.0.0.dev-10db259'

ERROR: Test the str.format method with NumPy scalar types
Traceback (most recent call last):
  File "/home/nwagner/local/lib64/python2.6/site-packages/nose-0.11.2.dev-py2.6.egg/nose/case.py", line 183, in runTest
    self.test(*self.arg)
  File "/home/nwagner/local/lib64/python2.6/site-packages/numpy/testing/decorators.py", line 146, in skipper_func
    return f(*args, **kwargs)
  File "/home/nwagner/local/lib64/python2.6/site-packages/numpy/core/tests/test_print.py", line 223, in test_scalar_format
    assert_equal(fmat.format(val), fmat.format(valtype(val)))
ValueError: Unknown format code 'g' for object of type 'str'
___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] loadtxt/savetxt tickets
Hi! I have had a look at the list of numpy.loadtxt tickets. I have never contributed to numpy before, so I may be doing stupid things - don't be afraid to let me know! My opinions are my own, and in detail, they are:

1752: I attach a possible patch. FWIW, I agree with the request. The patch is written to be compatible with the fix in ticket #1562, but I did not test that yet.

1731: This seems like a rather trivial feature enhancement. I attach a possible patch.

1616: The suggested patch seems reasonable to me, but I do not have a full list of what objects loadtxt supports today as opposed to what this patch will support.

1562: I attach a possible patch. This could also be the default behavior to my mind, since the function caller can simply call numpy.squeeze if needed. Changing default behavior would probably break old code, however.

1458: The fix suggested in the ticket seems reasonable, but I have never used record arrays, so I am not sure of this.

1445: Adding this functionality could break old code, as some old datafiles may have empty lines which are now simply ignored. I do not think the feature is a good idea. It could rather be implemented as a separate function.

1107: I do not see the need for this enhancement. In my eyes, the usecols kwarg does this and more. Perhaps I am misunderstanding something here.

1071: It is not clear to me whether loadtxt is supposed to support missing values in the fashion indicated in the ticket.

1163, 1565: These tickets seem to have the same origin of the problem. I attach one possible patch. The previously suggested patches that I've seen will not correctly convert floats to ints, which I believe my patch will.

I hope you find this useful! Is there some way of submitting the patches for review in a more convenient fashion than e-mail?

Cheers, Paul.

[Attachments: 1562.patch, 1163.patch, 1731.patch, 1752.patch]

On 25 March 2011, at 16:06, Charles R Harris wrote: Hi All, Could someone with an interest in loadtxt/savetxt look through the associated tickets? A search on the tickets using either of those keys will return fairly lengthy lists. Chuck
Re: [Numpy-discussion] loadtxt/savetxt tickets
Hi, Thanks! On Sat, 26 Mar 2011 13:11:46 +0100, Paul Anton Letnes wrote: [clip] I hope you find this useful! Is there some way of submitting the patches for review in a more convenient fashion than e-mail? You can attach them on the trac to each ticket. That way they'll be easy to find later on. Pauli
Re: [Numpy-discussion] loadtxt/savetxt tickets
Hi, On 26 Mar 2011, at 14:36, Pauli Virtanen wrote: On Sat, 26 Mar 2011 13:11:46 +0100, Paul Anton Letnes wrote: [clip] I hope you find this useful! Is there some way of submitting the patches for review in a more convenient fashion than e-mail? You can attach them on the trac to each ticket. That way they'll be easy to find later on. I've got some comments on 1562, and I'd attach a revised patch then - just a general question: should I then change Milestone to 1.6.0 and Version to 'devel'? 1562: I attach a possible patch. This could also be the default behavior to my mind, since the function caller can simply call numpy.squeeze if needed. Changing default behavior would probably break old code. Seems the fastest solution unless someone wants to change numpy.squeeze as well. But the present patch does not call np.squeeze any more at all, so I propose to restore that behaviour for X.ndim > ndmin to remain really backwards compatible. It also seems easier to code when making the default ndmin=0. Cheers, Derek
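For readers following along with a later NumPy, the ndmin keyword that eventually came out of ticket #1562 behaves as sketched below (this is against a modern numpy.loadtxt, where ndmin=0 is indeed the default, as Derek suggests):

```python
import numpy as np
from io import StringIO

# A single data row: with the default ndmin=0 the result is squeezed to 1-D,
# which is the old (backwards-compatible) behaviour.
flat = np.loadtxt(StringIO("1 2 3\n"))

# ndmin=2 keeps the row-and-column structure even for a single row.
table = np.loadtxt(StringIO("1 2 3\n"), ndmin=2)
```

Here flat has shape (3,) while table has shape (1, 3), so callers who always want 2-D output no longer need to special-case single-row files.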
Re: [Numpy-discussion] loadtxt/savetxt tickets
Hi again, On 26 Mar 2011, at 15:20, Derek Homeier wrote: 1562: I attach a possible patch. This could also be the default behavior to my mind, since the function caller can simply call numpy.squeeze if needed. Changing default behavior would probably break old code. Seems the fastest solution unless someone wants to change numpy.squeeze as well. But the present patch does not call np.squeeze any more at all, so I propose to restore that behaviour for X.ndim > ndmin to remain really backwards compatible. It also seems easier to code when making the default ndmin=0. I've got another somewhat general question: since it would probably be nice to have a test for this, I found one could simply add something along the lines of assert_equal(a.shape, x.shape) to test_io.py - test_shaped_dtype(self) - or should one generally create a new test for such things (might still be better in this case, since test_shaped_dtype does not really test different ndim)? Cheers, Derek
Re: [Numpy-discussion] loadtxt/savetxt tickets
Hi Derek! On 26 March 2011, at 15:48, Derek Homeier wrote: Hi again, On 26 Mar 2011, at 15:20, Derek Homeier wrote: 1562: I attach a possible patch. This could also be the default behavior to my mind, since the function caller can simply call numpy.squeeze if needed. Changing default behavior would probably break old code. Seems the fastest solution unless someone wants to change numpy.squeeze as well. But the present patch does not call np.squeeze any more at all, so I propose to restore that behaviour for X.ndim > ndmin to remain really backwards compatible. It also seems easier to code when making the default ndmin=0. I've got another somewhat general question: since it would probably be nice to have a test for this, I found one could simply add something along the lines of assert_equal(a.shape, x.shape) to test_io.py - test_shaped_dtype(self) - or should one generally create a new test for such things (might still be better in this case, since test_shaped_dtype does not really test different ndim)? Cheers, Derek It would be nice to see your patch. I uploaded all of mine as mentioned. I'm no testing expert, but I am sure someone else will comment on it. Paul.
[Numpy-discussion] Array views
Hello, Say I have a few 1d arrays and one 2d array whose columns I want to be the 1d arrays. I also want all the a arrays to share the *same data* with the b array. If I call my 1d arrays a1, a2, etc. and my 2d array b, then b[:,0] = a1[:] b[:,1] = a2[:] ... won't work because apparently copying occurs. I tried it the other way around, i.e. a1 = b[:,0] a2 = b[:,1] ... and it works, but that doesn't help me for my problem. Is there a way to reformulate the first code snippet above but with shallow copying? Thanks, -- Hugo Gagnon
Re: [Numpy-discussion] Array views
On Sat, 26 Mar 2011 13:10:42 -0400, Hugo Gagnon wrote: [clip] a1 = b[:,0] a2 = b[:,1] ... and it works but that doesn't help me for my problem. Is there a way to reformulate the first code snippet above but with shallow copying? No. You need a 2-D array to own the data. The second way is the approach to use if you want to share the data. -- Pauli Virtanen
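The difference Pauli describes can be seen in a few lines (the names below mirror the question; they are just for illustration):

```python
import numpy as np

b = np.zeros((4, 2))   # the 2-D array owns one contiguous block of memory
a1 = b[:, 0]           # views: they share b's data, no copy is made
a2 = b[:, 1]

a2[:] = 7.0            # writing through the view updates b[:,1] as well
assert a1.base is b    # a view does not own its data; its base is b

# Assigning INTO a slice copies values: b and c remain independent
c = np.arange(4.0)
b[:, 0] = c            # copies c's values into b's first column
c[0] = -1.0            # does not affect b
```

So the working pattern is exactly the "second way" from the question: allocate b first, then take a1, a2, ... as views of its columns.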
Re: [Numpy-discussion] Array views
Hi, I am also interested in this. In my application there is a large 2d array, lets call it 'b' to keep the notation consistent in the thread. b's columns need to be recomputed often. Ideally this re-computation happens in a function. Lets call that function updater(b, col_index): The simplest example is where updater(b, col_index) is a matrix vector multiply, where the matrix or the vector changes. Is there anyway apart from using ufuncs that I can make updater() write the result directly in b and not create a new temporary column that is then copied into b ? Say for the matrix vector multiply example. I can write the matrix vector product in terms of ufuncs but will lose out in terms of speed. In the best case scenario I would like to maintain 'b' in a csr sparse matrix form, as 'b' participates in a matrix vector multiply. I think csr would be asking for too much, but even ccs should help. I dont want to clutter this thread with the sparsity issues though, any solution to the original question or pointers to solutions would be appreciated. Thanks --srean On Sat, Mar 26, 2011 at 12:10 PM, Hugo Gagnon sourceforge.nu...@user.fastmail.fm wrote: Hello, Say I have a few 1d arrays and one 2d array which columns I want to be the 1d arrays. I also want all the a's arrays to share the *same data* with the b array. If I call my 1d arrays a1, a2, etc. and my 2d array b, then b[:,0] = a1[:] b[:,1] = a2[:] ... won't work because apparently copying occurs. I tried it the other way around i.e. a1 = b[:,0] a2 = b[:,1] ... and it works but that doesn't help me for my problem. Is there a way to reformulate the first code snippet above but with shallow copying? Thanks, -- Hugo Gagnon -- Hugo Gagnon ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Array views
On 3/26/11 10:32 AM, srean wrote: I am also interested in this. In my application there is a large 2d array, lets call it 'b' to keep the notation consistent in the thread. b's columns need to be recomputed often. Ideally this re-computation happens in a function. Lets call that function updater(b, col_index): The simplest example is where updater(b, col_index) is a matrix vector multiply, where the matrix or the vector changes. Is there anyway apart from using ufuncs that I can make updater() write the result directly in b and not create a new temporary column that is then copied into b? Say for the matrix vector multiply example.

Probably not -- the trick is that when an array is a view of a slice of another array, it may not be laid out in memory in a way that other libs (like LAPACK, BLAS, etc) require, so the data needs to be copied to call those routines. To understand all this, you'll need to study up a bit on how numpy arrays lay out and access the memory that they use: they use a concept of strided memory. It's very powerful and flexible, but most other numeric libs can't use those same data structures. I'm not sure what a good doc is to read to learn about this -- I learned it from messing with the C API. Take a look at any docs that talk about strides, and maybe playing with the stride tricks tools will help. A simple example:

In [3]: a = np.ones((3,4))

In [4]: a
Out[4]:
array([[ 1., 1., 1., 1.],
       [ 1., 1., 1., 1.],
       [ 1., 1., 1., 1.]])

In [5]: a.flags
Out[5]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

So a is a (3,4) array, stored in C_contiguous fashion, just like a regular old C array. A lib expecting data in this fashion could use the data pointer just like regular C code.

In [6]: a.strides
Out[6]: (32, 8)

This means it is 32 bytes from the start of one row to the next, and 8 bytes from the start of one element to the next -- which makes sense for a 64-bit double.
In [7]: b = a[:,1]

In [10]: b
Out[10]: array([ 1., 1., 1.])

So b is a 1-d array with three elements.

In [8]: b.flags
Out[8]:
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

But it is NOT C_contiguous -- the data is laid out differently than a standard C array.

In [9]: b.strides
Out[9]: (32,)

So this means that it is 32 bytes from one element to the next -- for an 8-byte data type. This is because the elements are each one element in a row of the a array -- they are not all next to each other. A regular C library generally won't be able to work with data laid out like this. HTH, -Chris

-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
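The numbers from Chris's session can be verified directly in a script; the column view's single stride equals the parent's row stride (the byte counts assume the default 8-byte float64):

```python
import numpy as np

a = np.ones((3, 4))    # C-contiguous, itemsize 8 on standard platforms
b = a[:, 1]            # column view: strided, NOT C-contiguous

# The view's stride is the parent's row stride: 4 elements * 8 bytes = 32
assert a.strides == (32, 8)
assert b.strides == (a.strides[0],)
assert not b.flags['C_CONTIGUOUS']

# The memory is shared: writing through the view shows up in the parent
b[0] = 99.0
```

After the last line, a[0, 1] is 99.0 even though only b was touched, which is exactly why a C library expecting one dense block cannot consume b's buffer directly.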
Re: [Numpy-discussion] Array views
On 3/26/11 10:12 AM, Pauli Virtanen wrote: On Sat, 26 Mar 2011 13:10:42 -0400, Hugo Gagnon wrote: [clip] a1 = b[:,0] a2 = b[:,1] ... and it works but that doesn't help me for my problem. Is there a way to reformulate the first code snippet above but with shallow copying? No. You need a 2-D array to own the data. The second way is the approach to use if you want to share the data. Exactly -- but to clarify, it's not just about ownership, it's about layout of the data in memory. The data in a numpy array needs to be laid out in memory as one block, with consistent strides from one element to the next, one row to the next, etc. When you create an array from scratch (like your 2-d array here), you get one big block of memory. If you create each row separately, they each have their own block of memory that are unrelated -- there is no way to put those together into one block with consistent strides. So you need to create that big block first (the 2-d array), then you can reference parts of it for each row. See my previous note for a bit more discussion. Oh, and maybe the little presentation and sample code I gave to the Seattle Python Interest group will help: http://www.seapig.org/November2010Notes -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
Re: [Numpy-discussion] Array views
Hi Christopher, thanks for taking the time to reply at length. I do understand the concept of striding in general but was not familiar with the NumPy way of accessing that information, so thanks for pointing me to .flags and .strides. That said, BLAS/LAPACK do have APIs that take the stride length into account. But for sparse arrays I think it's a hopeless situation. That is a bummer, because sparse is what I need. Oh well, I will probably do it in C++. --srean p.s. I hope top-posting is not frowned upon here. If so, I will keep that in mind in my future posts. On Sat, Mar 26, 2011 at 1:31 PM, Christopher Barker chris.bar...@noaa.gov wrote: Probably not -- the trick is that when an array is a view of a slice of another array, it may not be laid out in memory in a way that other libs (like LAPACK, BLAS, etc) require, so the data needs to be copied to call those routines. [...]
Re: [Numpy-discussion] Array views
On Sat, 26 Mar 2011 12:32:24 -0500, srean wrote: [clip] Is there anyway apart from using ufuncs that I can make updater() write the result directly in b and not create a new temporary column that is then copied into b? Say for the matrix vector multiply example. I can write the matrix vector product in terms of ufuncs but will lose out in terms of speed. Well, you can e.g. write

    def updater(b, col_idx):
        b[:,col_idx] *= 3   # modifies b[:,col_idx] in place

And ditto for sparse matrices --- but maybe this is not what you asked. If you want to have control over temporaries, you can make use of the out= argument of ufuncs (`numpy.dot` will gain it in 1.6.1 --- you can call LAPACK routines from scipy.lib in the meantime, if your data is in Fortran order). Also numexpr is probably able to write the output directly to a given array --- using it is an alternative way to avoid temporaries, and probably easier to write than doing things via the out= arguments. For sparse matrices, things then depend on how they are laid out in memory. You can probably alter the `.data` attribute of the arrays directly, if you know how the underlying representation works. -- Pauli Virtanen
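Pauli's out= pattern, written out as a small runnable toy (using a ufunc stand-in for the matrix-vector product; the function name updater just follows the thread's notation):

```python
import numpy as np

def updater(b, col_idx):
    # Write the result straight into b's column via the ufunc's out=
    # argument -- no temporary column array is created and then copied.
    np.multiply(b[:, col_idx], 3.0, out=b[:, col_idx])

b = np.arange(12.0).reshape(3, 4)
before = b[:, 2].copy()
updater(b, 2)       # b[:, 2] is now tripled, in place
```

Ufuncs handle strided output views fine, so this works even though b[:, 2] is not contiguous; it is the BLAS-backed calls like dot that needed the 1.6.1 out= support discussed below.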
Re: [Numpy-discussion] Array views
On Sat, 26 Mar 2011 19:13:43 +0000, Pauli Virtanen wrote: [clip] If you want to have control over temporaries, you can make use of the out= argument of ufuncs (`numpy.dot` will gain it in 1.6.1 --- you can call LAPACK routines from scipy.lib in the meantime, if your data is in Fortran order). Like so:

    # Fortran-order for efficient DGEMM -- each column must be contiguous
    A = np.random.randn(4,4).copy('F')
    b = np.random.randn(4,10).copy('F')

    def updater(b, col_idx):
        # This will work in Numpy 1.6.1
        dot(A, b[:,col_idx].copy(), out=b[:,col_idx])

In the meantime you can do

    A = np.random.randn(4,4).copy('F')
    b = np.random.randn(4,10).copy('F')

    from scipy.lib.blas import get_blas_funcs
    gemm, = get_blas_funcs(['gemm'], [A, b])   # get correct type func

    def updater(b, col_idx):
        bcol = b[:,col_idx]
        c = gemm(1.0, A, bcol.copy(), 0.0, bcol, overwrite_c=True)
        assert c is bcol   # check that it didn't make copies!

Note that DGEMM and `dot` cannot do in-place multiplication -- at least the BLAS library I have fails when the B and C arguments point to the same memory, so you'll anyway end up with one temporary. (This has nothing to do with Scipy -- the same occurs in Fortran.) -- Pauli Virtanen
Re: [Numpy-discussion] Array views
Ah! very nice. I did not know that numpy 1.6.1 supports in-place 'dot', nor that you could access the underlying BLAS functions like so. This is pretty neat. Thanks. Now I at least have an idea how the sparse version might work. If I get time I will probably give numpy 1.6.1 a shot. I already have the MKL libraries thanks to the free version of EPD for students. On Sat, Mar 26, 2011 at 2:34 PM, Pauli Virtanen p...@iki.fi wrote: [...]
Re: [Numpy-discussion] Array views
On Sat, Mar 26, 2011 at 3:16 PM, srean srean.l...@gmail.com wrote: Ah! very nice. I did not know that numpy 1.6.1 supports in-place 'dot'. "In place" is perhaps not the right word; I meant "in a specified location".
Re: [Numpy-discussion] loadtxt/savetxt tickets
On Sat, Mar 26, 2011 at 8:53 AM, Paul Anton Letnes paul.anton.let...@gmail.com wrote: [...] It would be nice to see your patch. I uploaded all of mine as mentioned. I'm no testing expert, but I am sure someone else will comment on it. I put all these patches together at https://github.com/charris/numpy/tree/loadtxt-savetxt. Please pull from there to continue work on loadtxt/savetxt so as to avoid conflicts in the patches. One of the numpy tests is failing, I assume from patch conflicts, and more tests for the tickets are needed in any case. Also, new keywords should be added to the end, not put in the middle of existing keywords. I haven't reviewed the patches, just tried to get them organized. Also, I have Derek as the author on all of them; that can be changed if it is decided the credit should go elsewhere ;) Thanks for the work you all have been doing on these tickets.
Chuck
Re: [Numpy-discussion] loadtxt/savetxt tickets
Hi Paul, having had a look at the other tickets you dug up:

> 1752: I attach a possible patch. FWIW, I agree with the request. The patch is written to be compatible with the fix in ticket #1562, but I did not test that yet.

Tested, see also my comments on Trac.

> 1731: This seems like a rather trivial feature enhancement. I attach a possible patch.

Agreed. Haven't tested it though.

> 1616: The suggested patch seems reasonable to me, but I do not have a full list of what objects loadtxt supports today as opposed to what this patch will support.

> 1562: I attach a possible patch. This could also be the default behavior to my mind, since the function caller can simply call numpy.squeeze if needed. Changing default behavior would probably break old code, however.

See comments on Trac as well.

> 1458: The fix suggested in the ticket seems reasonable, but I have never used record arrays, so I am not sure of this.

There were some issues with Python 3, and I also had some general reservations as noted on Trac - basically, it makes 'unpack' equivalent to transposing for 2D-arrays, but to splitting into fields for 1D-recarrays. My question was, what's going to happen when you get to 2D-recarrays? Currently this is not an issue since loadtxt can only read 2D regular or 1D structured arrays. But this might change if the data block functionality (see below) were to be implemented - data could then be returned as 3D arrays or 2D structured arrays... Still, it would probably make most sense (or at least give the widest functionality) to have 'unpack=True' always return a list or iterator over columns.

> 1445: Adding this functionality could break old code, as some old datafiles may have empty lines which are now simply ignored. I do not think the feature is a good idea. It could rather be implemented as a separate function.

> 1107: I do not see the need for this enhancement. In my eyes, the usecols kwarg does this and more.
> Perhaps I am misunderstanding something here.

Agree about #1445, and the bit about 'usecols' - 'numcols' would just provide a shorter call to e.g. read the first 20 columns of a file (well, not even that much over 'usecols=range(20)'...), and I don't think that justifies an extra argument. But the 'datablocks' provides something new, that a number of people seem to miss from e.g. gnuplot (including me, actually ;-). And it would also satisfy the request from #1445 without breaking backwards compatibility. I've been wondering if one could instead specify the separator lines through the parameter, e.g. blocksep=['None', 'blank', 'invalid'] - not sure if that would make it more useful...

> 1071: It is not clear to me whether loadtxt is supposed to support missing values in the fashion indicated in the ticket.

In principle it should at least allow you to, by the use of converters as described there. The problem is, the default delimiter is described as 'any whitespace', which in the present implementation obviously includes any number of blanks or tabs. These are therefore treated differently from delimiters like ','. I'd reckon there are too many people actually relying on this behaviour to silently change it (e.g. I know plenty of tables with columns separated by either one or several tabs depending on the length of the previous entry). But the tab is apparently also treated differently if explicitly specified with delimiter='\t' - and in that case using a converter à la {2: lambda s: float(s or 'NaN')} is working for fields in the middle of the line, but not at the end - clearly warrants improvement. I've prepared a patch working for Python 3 as well.

> 1163: 1565: These tickets seem to have the same origin of the problem. I attach one possible patch. The previously suggested patches that I've seen will not correctly convert floats to ints, which I believe my patch will.
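The converter approach Derek describes for ticket #1071 can be sketched like this against a current loadtxt (whether empty trailing fields actually reached the converter is exactly what his patch addressed; in modern NumPy they do):

```python
import numpy as np
from io import StringIO

# Two rows with missing values: one trailing, one in the middle
data = StringIO("1,2,\n4,,6\n")

# Map empty fields to NaN, column by column (converter keys are column
# indices; the lambda ignores the loop variable on purpose)
conv = {i: (lambda s: float(s.strip() or "nan")) for i in range(3)}
arr = np.loadtxt(data, delimiter=",", converters=conv)
```

The result is a (2, 3) float array with NaN where the fields were empty, which is about as far as loadtxt goes - anything fancier (masking, fill values) is genfromtxt territory.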
+1, though I am a bit concerned that prompting to raise a ValueError for every element could impede performance. I'd probably still enclose it in an if issubclass(typ, np.uint64) or issubclass(typ, np.int64): just like in npio.patch. I also thought one might switch to int(float128(x)) in that case, but at least for the given examples float128 cannot convert with more accuracy than float64 (even on PowerPC ;-). There were some dissenting opinions that trying to read a float into an int should generally throw an exception, though... And Chuck just beat me... On 26 Mar 2011, at 21:25, Charles R Harris wrote:

> I put all these patches together at https://github.com/charris/numpy/tree/loadtxt-savetxt. Please pull from there to continue work on loadtxt/savetxt so as to avoid conflicts in the patches. One of the numpy tests is failing, I assume from patch conflicts, and more tests for the tickets are needed in any case. [...]
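The float-to-int conversion that tickets 1163/1565 are after can be emulated today with a converter that routes each field through float first (a sketch; the helper name to_int is just illustrative, not part of any patch):

```python
import numpy as np
from io import StringIO

# dtype=int alone would choke on fields like "1.0"; going through float
# first is the int(float(x)) chain discussed in the thread.
to_int = lambda s: int(float(s))
arr = np.loadtxt(StringIO("1.0 2.0\n3.5 4.0\n"),
                 dtype=int, converters={0: to_int, 1: to_int})
```

Note that int() truncates toward zero, so "3.5" becomes 3 - which is one reason some felt reading a float into an int should raise instead of silently converting.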
Re: [Numpy-discussion] ValueError: Unknown format code 'g' for object of type 'str'
It turns out that Python 2.6 complex doesn't implement __format__, and that results in the problem. I've disabled the complex formatting tests for 2.6 in commit 7d436cc8994f9efbc512.

http://web.archiveorange.com/archive/v/jA6s92Ni29ENZpi4rpz5

-Mark

On Sat, Mar 26, 2011 at 3:34 AM, Nils Wagner nwag...@iam.uni-stuttgart.de wrote:

> numpy.__version__ '2.0.0.dev-10db259'
> ERROR: Test the str.format method with NumPy scalar types
> [...]
> ValueError: Unknown format code 'g' for object of type 'str'
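The root cause Mark identifies is easy to check on a modern interpreter: complex gained __format__ support in Python 2.7/3.x, so the 'g' code now works, whereas on 2.6 str.format fell back to the str representation and then failed trying to apply 'g' to it:

```python
# On Python 2.6 this raised:
#   ValueError: Unknown format code 'g' for object of type 'str'
# On Python 2.7+ / 3.x, complex implements __format__ and this succeeds.
formatted = "{0:g}".format(1.5 + 2j)
```

The error message naming type 'str' (rather than complex) was the clue: the format machinery had already converted the value to a string before the 'g' code was applied.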