[Numpy-discussion] ValueError: Unknown format code 'g' for object of type 'str'

2011-03-26 Thread Nils Wagner
>>> numpy.__version__
'2.0.0.dev-10db259'

==
ERROR: Test the str.format method with NumPy scalar types
--
Traceback (most recent call last):
  File "/home/nwagner/local/lib64/python2.6/site-packages/nose-0.11.2.dev-py2.6.egg/nose/case.py", line 183, in runTest
    self.test(*self.arg)
  File "/home/nwagner/local/lib64/python2.6/site-packages/numpy/testing/decorators.py", line 146, in skipper_func
    return f(*args, **kwargs)
  File "/home/nwagner/local/lib64/python2.6/site-packages/numpy/core/tests/test_print.py", line 223, in test_scalar_format
    assert_equal(fmat.format(val), fmat.format(valtype(val)),
ValueError: Unknown format code 'g' for object of type 'str'

  
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] loadtxt/savetxt tickets

2011-03-26 Thread Paul Anton Letnes
Hi!

I have had a look at the list of numpy.loadtxt tickets. I have never 
contributed to numpy before, so I may be doing stupid things - don't be afraid 
to let me know!

My opinions are my own, and in detail, they are:
1752:
I attach a possible patch. FWIW, I agree with the request. The patch is 
written to be compatible with the fix in ticket #1562, but I did not test that 
yet.
1731:
This seems like a rather trivial feature enhancement. I attach a possible 
patch.
1616:
The suggested patch seems reasonable to me, but I do not have a full list 
of what objects loadtxt supports today as opposed to what this patch will 
support.
1562:
I attach a possible patch. This could also be the default behavior to my 
mind, since the function caller can simply call numpy.squeeze if needed. 
Changing default behavior would probably break old code, however.
1458:
The fix suggested in the ticket seems reasonable, but I have never used 
record arrays, so I am not sure of this.
1445:
Adding this functionality could break old code, as some old datafiles may 
have empty lines which are now simply ignored. I do not think the feature is a 
good idea. It could rather be implemented as a separate function.
1107:
I do not see the need for this enhancement. In my eyes, the usecols kwarg 
does this and more. Perhaps I am misunderstanding something here.
1071:
It is not clear to me whether loadtxt is supposed to support missing 
values in the fashion indicated in the ticket.
1163:
1565:
These tickets seem to have the same origin of the problem. I attach one 
possible patch. The previously suggested patches that I've seen will not 
correctly convert floats to ints, which I believe my patch will.

I hope you find this useful! Is there some way of submitting the patches for 
review in a more convenient fashion than e-mail?

Cheers,
Paul.



1562.patch
Description: Binary data


1163.patch
Description: Binary data


1731.patch
Description: Binary data


1752.patch
Description: Binary data



On 25. mars 2011, at 16.06, Charles R Harris wrote:

 Hi All,
 
 Could someone with an interest in loadtxt/savetxt look through the associated 
 tickets? A search on the tickets using either of those keys will return 
 fairly lengthy lists.
 
 Chuck
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] loadtxt/savetxt tickets

2011-03-26 Thread Pauli Virtanen
Hi,

Thanks!

On Sat, 26 Mar 2011 13:11:46 +0100, Paul Anton Letnes wrote:
[clip]
 I hope you find this useful! Is there some way of submitting the patches
 for review in a more convenient fashion than e-mail?

You can attach them on the trac to each ticket. That way they'll be easy 
to find later on.

Pauli

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] loadtxt/savetxt tickets

2011-03-26 Thread Derek Homeier
Hi,

On 26 Mar 2011, at 14:36, Pauli Virtanen wrote:

 On Sat, 26 Mar 2011 13:11:46 +0100, Paul Anton Letnes wrote:
 [clip]
 I hope you find this useful! Is there some way of submitting the  
 patches
 for review in a more convenient fashion than e-mail?

 You can attach them on the trac to each ticket. That way they'll be  
 easy
 to find later on.

I've got some comments on 1562, and I'd attach a revised patch then - just
a general question: should I then change Milestone to 1.6.0 and Version
to 'devel'?

 1562:
    I attach a possible patch. This could also be the default
 behavior to my mind, since the function caller can simply call
 numpy.squeeze if needed. Changing default behavior would probably
 break old code,

Seems the fastest solution unless someone wants to change numpy.squeeze
as well. But the present patch does not call np.squeeze any more at all,
so I propose to restore that behaviour for X.ndim < ndmin to remain really
backwards compatible. It also seems easier to code when making the default
ndmin=0.
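
The #1562 behaviour under discussion can be illustrated with a minimal sketch in modern numpy, where the ndmin keyword eventually landed (hypothetical one-line data, not the patch itself):

```python
import io
import numpy as np

one_row = io.StringIO("1.0 2.0 3.0\n")
a = np.loadtxt(one_row)            # a single-row file collapses to 1-D
assert a.shape == (3,)

one_row.seek(0)
b = np.loadtxt(one_row, ndmin=2)   # ndmin=2 keeps the 2-D shape
assert b.shape == (1, 3)

# and the caller can always squeeze back down explicitly
assert np.squeeze(b).shape == (3,)
```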

Cheers,
Derek

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] loadtxt/savetxt tickets

2011-03-26 Thread Derek Homeier
Hi again,

On 26 Mar 2011, at 15:20, Derek Homeier wrote:

 1562:
   I attach a possible patch. This could also be the default
 behavior to my mind, since the function caller can simply call
 numpy.squeeze if needed. Changing default behavior would probably
 break old code,

 Seems the fastest solution unless someone wants to change  
 numpy.squeeze
 as well. But the present patch does not call np.squeeze any more at
 all, so I
 propose to restore that behaviour for X.ndim < ndmin to remain really
 backwards
 compatible. It also seems easier to code when making the default
 ndmin=0.

I've got another somewhat general question: since it would probably be
nice to have a test for this, I found one could simply add something
along the lines of

assert_equal(a.shape, x.shape)

to test_io.py - test_shaped_dtype(self), or should one generally create
a new test for such things (might still be better in this case, since
test_shaped_dtype does not really test different ndim)?

Cheers,
Derek


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] loadtxt/savetxt tickets

2011-03-26 Thread Paul Anton Letnes
Hi Derek!

On 26. mars 2011, at 15.48, Derek Homeier wrote:

 Hi again,
 
 On 26 Mar 2011, at 15:20, Derek Homeier wrote:
 
 1562:
  I attach a possible patch. This could also be the default
 behavior to my mind, since the function caller can simply call
 numpy.squeeze if needed. Changing default behavior would probably
 break old code,
 
 Seems the fastest solution unless someone wants to change  
 numpy.squeeze
 as well. But the present patch does not call np.squeeze any more at
 all, so I
 propose to restore that behaviour for X.ndim < ndmin to remain really
 backwards
 compatible. It also seems easier to code when making the default
 ndmin=0.
 
 I've got another somewhat general question: since it would probably be  
 nice to
 have a test for this, I found one could simply add something along the  
 lines of
 
 assert_equal(a.shape, x.shape)
 
 to test_io.py - test_shaped_dtype(self)
 or should one generally create a new test for such things (might still  
 be better
 in this case, since test_shaped_dtype does not really test different  
 ndim)?
 
 Cheers,
   Derek

It would be nice to see your patch. I uploaded all of mine as mentioned. I'm no 
testing expert, but I am sure someone else will comment on it.

Paul.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Array views

2011-03-26 Thread Hugo Gagnon
Hello,

Say I have a few 1d arrays and one 2d array whose columns I want to be
the 1d arrays.
I also want all the a's arrays to share the *same data* with the b
array.
If I call my 1d arrays a1, a2, etc. and my 2d array b, then

b[:,0] = a1[:]
b[:,1] = a2[:]
...

won't work because apparently copying occurs.
I tried it the other way around i.e.

a1 = b[:,0]
a2 = b[:,1]
...

and it works but that doesn't help me for my problem.
Is there a way to reformulate the first code snippet above but with
shallow copying?

Thanks,
--
  Hugo Gagnon
--
  Hugo Gagnon

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Array views

2011-03-26 Thread Pauli Virtanen
On Sat, 26 Mar 2011 13:10:42 -0400, Hugo Gagnon wrote:
[clip]
 a1 = b[:,0]
 a2 = b[:,1]
 ...
 
 and it works but that doesn't help me for my problem. Is there a way to
 reformulate the first code snippet above but with shallow copying?

No. You need a 2-D array to own the data. The second way is the 
approach to use if you want to share the data.
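
A minimal sketch of the working direction (hypothetical array names): create the 2-D array first, then take column views that share its memory:

```python
import numpy as np

b = np.zeros((3, 2))
a1 = b[:, 0]           # a view into b's buffer, not a copy
a2 = b[:, 1]

a1[:] = 7.0            # writing through the view updates b
assert b[0, 0] == 7.0
assert a1.base is b    # a1 does not own its data; b does
```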

-- 
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Array views

2011-03-26 Thread srean
Hi,

 I am also interested in this. In my application there is a large 2d array,
let's call it 'b' to keep the notation consistent in the thread. b's
columns need to be recomputed often. Ideally this re-computation happens in
a function. Let's call that function updater(b, col_index). The simplest
example is where updater(b, col_index) is a matrix vector multiply, where
the matrix or the vector changes.

 Is there any way, apart from using ufuncs, that I can make updater() write
the result directly in b and not create a new temporary column that is then
copied into b? Say for the matrix vector multiply example.
I can write the matrix vector product in terms of ufuncs but will lose out
in terms of speed.

In the best case scenario I would like to maintain 'b' in a csr sparse
matrix form, as 'b' participates in a matrix vector multiply. I think csr
would be asking for too much, but even ccs should help. I don't want to
clutter this thread with the sparsity issues though; any solution to the
original question or pointers to solutions would be appreciated.

Thanks
  --srean

On Sat, Mar 26, 2011 at 12:10 PM, Hugo Gagnon 
sourceforge.nu...@user.fastmail.fm wrote:

 Hello,

 Say I have a few 1d arrays and one 2d array whose columns I want to be
 the 1d arrays.
 I also want all the a's arrays to share the *same data* with the b
 array.
 If I call my 1d arrays a1, a2, etc. and my 2d array b, then

 b[:,0] = a1[:]
 b[:,1] = a2[:]
 ...

 won't work because apparently copying occurs.
 I tried it the other way around i.e.

 a1 = b[:,0]
 a2 = b[:,1]
 ...

 and it works but that doesn't help me for my problem.
 Is there a way to reformulate the first code snippet above but with
 shallow copying?

 Thanks,
 --
  Hugo Gagnon
 --
  Hugo Gagnon

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Array views

2011-03-26 Thread Christopher Barker
On 3/26/11 10:32 AM, srean wrote:
   I am also interested in this. In my application there is a large 2d
 array, let's call it 'b' to keep the notation consistent in the thread.
 b's columns need to be recomputed often. Ideally this re-computation
 happens in a function. Let's call that function updater(b, col_index).
 The simplest example is where updater(b, col_index) is a matrix vector
 multiply, where the matrix or the vector changes.

   Is there any way, apart from using ufuncs, that I can make updater()
 write the result directly in b and not create a new temporary column
 that is then copied into b? Say for the matrix vector multiply example.

Probably not -- the trick is that when an array is a view of a slice of 
another array, it may not be laid out in memory in a way that other libs 
(like LAPACK, BLAS, etc) require, so the data needs to be copied to call 
those routines.

To understand all this, you'll need to study up a bit on how numpy 
arrays lay out and access the memory that they use: they use a concept 
of strided memory. It's very powerful and flexible, but most other 
numeric libs can't use those same data structures. I'm not sure what a 
good doc is to read to learn about this -- I learned it from messing 
with the C API. Take a look at any docs that talk about strides, and 
maybe playing with the stride tricks tools will help.

A simple example:

In [3]: a = np.ones((3,4))

In [4]: a
Out[4]:
array([[ 1.,  1.,  1.,  1.],
[ 1.,  1.,  1.,  1.],
[ 1.,  1.,  1.,  1.]])

In [5]: a.flags
Out[5]:
   C_CONTIGUOUS : True
   F_CONTIGUOUS : False
   OWNDATA : True
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False

So a is a (3,4) array, stored in C_contiguous fashion, just like a 
regular old C array. A lib expecting data in this fashion could use 
the data pointer just like regular C code.

In [6]: a.strides
Out[6]: (32, 8)

this means it is 32 bytes from the start of one row to the next, and 8 
bytes from the start of one element to the next -- which makes sense for 
a 64bit double.


In [7]: b = a[:,1]

In [10]: b
Out[10]: array([ 1.,  1.,  1.])

so b is a 1-d array with three elements.

In [8]: b.flags
Out[8]:
   C_CONTIGUOUS : False
   F_CONTIGUOUS : False
   OWNDATA : False
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False

but it is NOT C_Contiguous - the data is laid out differently than a 
standard C array.

In [9]: b.strides
Out[9]: (32,)

so this means that it is 32 bytes from one element to the next -- for an 
8 byte data type. This is because the elements are each one element in a 
row of the a array -- they are not all next to each other. A regular C 
library generally won't be able to work with data laid out like this.
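
The "stride tricks tools" mentioned above can be played with directly; a small sketch (assuming 8-byte floats) builds overlapping windows out of nothing but strides:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(6.0)        # 1-D, itemsize is 8 bytes
step = a.strides[0]       # 8 on a typical 64-bit float build

# Four overlapping length-3 windows over the same buffer -- no copying
win = as_strided(a, shape=(4, 3), strides=(step, step))
assert win[0].tolist() == [0.0, 1.0, 2.0]
assert win[3].tolist() == [3.0, 4.0, 5.0]
```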

HTH,

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Array views

2011-03-26 Thread Christopher Barker
On 3/26/11 10:12 AM, Pauli Virtanen wrote:
 On Sat, 26 Mar 2011 13:10:42 -0400, Hugo Gagnon wrote:
 [clip]
 a1 = b[:,0]
 a2 = b[:,1]
 ...

 and it works but that doesn't help me for my problem. Is there a way to
 reformulate the first code snippet above but with shallow copying?

 No. You need a 2-D array to own the data. The second way is the
 approach to use if you want to share the data.

exactly -- but to clarify, it's not just about ownership, it's about the 
layout of the data in memory. The data in a numpy array needs to be laid 
out in memory as one block, with consistent strides from one element to 
the next, one row to the next, etc.

When you create an array from scratch (like your 2-d array here), you 
get one big block of memory. If you create each row separately, they 
each have their own block of memory that are unrelated -- there is no 
way to put those together into one block with consistent strides.

So you need to create that big block first (the 2-d array), then you can 
reference parts of it for each row.

See my previous note for a bit more discussion.

Oh, and maybe the little presentation and sample code I gave to the 
Seattle Python Interest group will help:

http://www.seapig.org/November2010Notes

-Chris



-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Array views

2011-03-26 Thread srean
Hi Christopher,

thanks for taking the time to reply at length. I do understand the concept
of striding in general but was not familiar with the numpy way of accessing
that information. So thanks for pointing me to .flags and .strides.

That said, BLAS/LAPACK do have APIs that take the stride length into
account. But for sparse arrays I think it's a hopeless situation. That is a
bummer, because sparse is what I need. Oh well, I will probably do it in C++

-- srean

p.s. I hope top posting is not frowned upon here. If so, I will keep that in
mind in my future posts.

On Sat, Mar 26, 2011 at 1:31 PM, Christopher Barker
chris.bar...@noaa.govwrote:


 Probably not -- the trick is that when an array is a view of a slice of
 another array, it may not be laid out in memory in a way that other libs
 (like LAPACK, BLAS, etc) require, so the data needs to be copied to call
 those routines.

 To understand all this, you'll need to study up a bit on how numpy
 arrays lay out and access the memory that they use: they use a concept
 of strided memory. It's very powerful and flexible, but most other
 numeric libs can't use those same data structures. I'm not sure what a
 good doc is to read to learn about this -- I learned it from messing
 with the C API. Take a look at any docs that talk about strides, and
 maybe playing with the stride tricks tools will help.

 A simple example:

 In [3]: a = np.ones((3,4))

 In [4]: a
 Out[4]:
 array([[ 1.,  1.,  1.,  1.],
[ 1.,  1.,  1.,  1.],
[ 1.,  1.,  1.,  1.]])

 In [5]: a.flags
 Out[5]:
   C_CONTIGUOUS : True
   F_CONTIGUOUS : False
   OWNDATA : True
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False

 So a is a (3,4) array, stored in C_contiguous fashion, just like a
 regular old C array. A lib expecting data in this fashion could use
 the data pointer just like regular C code.

 In [6]: a.strides
 Out[6]: (32, 8)

 this means it is 32 bytes from the start of one row to the next, and 8
 bytes from the start of one element to the next -- which makes sense for
 a 64bit double.


 In [7]: b = a[:,1]

 In [10]: b
 Out[10]: array([ 1.,  1.,  1.])

 so b is a 1-d array with three elements.

 In [8]: b.flags
 Out[8]:
   C_CONTIGUOUS : False
   F_CONTIGUOUS : False
   OWNDATA : False
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False

 but it is NOT C_Contiguous - the data is laid out differently than a
 standard C array.

 In [9]: b.strides
 Out[9]: (32,)

 so this means that it is 32 bytes from one element to the next -- for an
 8 byte data type. This is because the elements are each one element in a
 row of the a array -- they are not all next to each other. A regular C
 library generally won't be able to work with data laid out like this.


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Array views

2011-03-26 Thread Pauli Virtanen
On Sat, 26 Mar 2011 12:32:24 -0500, srean wrote:
[clip]
 Is there anyway apart from using ufuncs that I can make updater() write
 the result directly in b and not create a new temporary column that is
 then copied into b?  Say for the matrix vector multiply example. I can
 write the matrix vector product in terms of ufuncs but will lose out in
 terms of speed.

Well, you can e.g. write

def updater(b, col_idx):
    b[:,col_idx] *= 3  # <- modifies b[:,col_idx] in place

And ditto for sparse matrices --- but maybe this is not what you asked.

If you want to have control over temporaries, you can make use of the 
out= argument of ufuncs (`numpy.dot` will gain it in 1.6.1 --- you can 
call LAPACK routines from scipy.lib in the meantime, if your data is in 
Fortran order).

Also numexpr is probably able to write the output directly to a given 
array --- using it is an alternative way to avoid temporaries, and 
probably easier to write than doing things via the out= arguments.

For sparse matrices, things then depend on how they are laid out in 
memory. You can probably alter the `.data` attribute of the arrays 
directly, if you know how the underlying representation works.
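
The out= route described above can be sketched like this (hypothetical data; np.multiply stands in for whichever ufunc does the real update):

```python
import numpy as np

b = np.arange(12.0).reshape(4, 3)
expected = b[:, 1] * 3.0         # computed before the in-place update

col = b[:, 1]                    # a view into b, not a copy
np.multiply(col, 3.0, out=col)   # ufunc writes straight into b's buffer

assert np.array_equal(b[:, 1], expected)
```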

-- 
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Array views

2011-03-26 Thread Pauli Virtanen
On Sat, 26 Mar 2011 19:13:43 +, Pauli Virtanen wrote:
[clip]
 If you want to have control over temporaries, you can make use of the
 out= argument of ufuncs (`numpy.dot` will gain it in 1.6.1 --- you can
 call LAPACK routines from scipy.lib in the meantime, if your data is in
 Fortran order).

Like so:

# Fortran-order for efficient DGEMM -- each column must be contiguous
A = np.random.randn(4,4).copy('F')
b = np.random.randn(4,10).copy('F')

def updater(b, col_idx):
    # This will work in Numpy 1.6.1
    np.dot(A, b[:,col_idx].copy(), out=b[:,col_idx])

In the meantime you can do

A = np.random.randn(4,4).copy('F')
b = np.random.randn(4,10).copy('F')

from scipy.lib.blas import get_blas_funcs
gemm, = get_blas_funcs(['gemm'], [A, b]) # get correct type func

def updater(b, col_idx):
    bcol = b[:,col_idx]
    c = gemm(1.0, A, bcol.copy(), 0.0, bcol, overwrite_c=True)
    assert c is bcol  # check that it didn't make copies!

Note that DGEMM and `dot` cannot do in-place multiplication -- at least 
the BLAS library I have fails when the B and C arguments point to the 
same memory, so you'll anyway end up with one temporary. (This has 
nothing to do with Scipy -- same occurs in Fortran).

-- 
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Array views

2011-03-26 Thread srean
Ah! very nice. I did not know that numpy-1.6.1 supports in place 'dot', nor
that you could access the underlying BLAS functions like so. This is pretty
neat. Thanks. Now I at least have an idea how the sparse version might work.

If I get time I will probably give numpy-1.6.1 a shot. I already have the
MKL libraries thanks to the free version of EPD for students.


On Sat, Mar 26, 2011 at 2:34 PM, Pauli Virtanen p...@iki.fi wrote:


 Like so:

# Fortran-order for efficient DGEMM -- each column must be contiguous
A = np.random.randn(4,4).copy('F')
b = np.random.randn(4,10).copy('F')

def updater(b, col_idx):
    # This will work in Numpy 1.6.1
    np.dot(A, b[:,col_idx].copy(), out=b[:,col_idx])

 In the meantime you can do

A = np.random.randn(4,4).copy('F')
b = np.random.randn(4,10).copy('F')

from scipy.lib.blas import get_blas_funcs
gemm, = get_blas_funcs(['gemm'], [A, b]) # get correct type func

def updater(b, col_idx):
    bcol = b[:,col_idx]
    c = gemm(1.0, A, bcol.copy(), 0.0, bcol, overwrite_c=True)
    assert c is bcol  # check that it didn't make copies!

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Array views

2011-03-26 Thread srean
On Sat, Mar 26, 2011 at 3:16 PM, srean srean.l...@gmail.com wrote:


 Ah! very nice. I did not know that numpy-1.6.1 supports in place 'dot',


In place is perhaps not the right word, I meant in a specified location
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] loadtxt/savetxt tickets

2011-03-26 Thread Charles R Harris
On Sat, Mar 26, 2011 at 8:53 AM, Paul Anton Letnes 
paul.anton.let...@gmail.com wrote:

 Hi Derek!

 On 26. mars 2011, at 15.48, Derek Homeier wrote:

  Hi again,
 
  On 26 Mar 2011, at 15:20, Derek Homeier wrote:
 
  1562:
   I attach a possible patch. This could also be the default
  behavior to my mind, since the function caller can simply call
  numpy.squeeze if needed. Changing default behavior would probably
  break old code,
 
  Seems the fastest solution unless someone wants to change
  numpy.squeeze
  as well. But the present patch does not call np.squeeze any more at
  all, so I
  propose to restore that behaviour for X.ndim < ndmin to remain really
  backwards
  compatible. It also seems easier to code when making the default
  ndmin=0.
 
  I've got another somewhat general question: since it would probably be
  nice to
  have a test for this, I found one could simply add something along the
  lines of
 
  assert_equal(a.shape, x.shape)
 
  to test_io.py - test_shaped_dtype(self)
  or should one generally create a new test for such things (might still
  be better
  in this case, since test_shaped_dtype does not really test different
  ndim)?
 
  Cheers,
Derek

 It would be nice to see your patch. I uploaded all of mine as mentioned.
 I'm no testing expert, but I am sure someone else will comment on it.


I put all these patches together at
https://github.com/charris/numpy/tree/loadtxt-savetxt. Please pull from
there to continue work on loadtxt/savetxt so as to avoid conflicts in the
patches. One of the numpy tests is failing, I assume from patch conflicts,
and more tests for the tickets are needed in any case. Also, new keywords
should be added to the end, not put in the middle of existing keywords.

I haven't reviewed the patches, just tried to get them organized. Also, I
have Derek as the author on all of them, that can be changed if it is
decided the credit should go elsewhere ;) Thanks for the work you all have
been doing on these tickets.

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] loadtxt/savetxt tickets

2011-03-26 Thread Derek Homeier
Hi Paul,

having had a look at the other tickets you dug up,

 My opinions are my own, and in detail, they are:
 1752:
I attach a possible patch. FWIW, I agree with the request. The  
 patch is written to be compatible with the fix in ticket #1562, but  
 I did not test that yet.

Tested, see also my comments on Trac.

 1731:
This seems like a rather trivial feature enhancement. I attach a  
 possible patch.

Agreed. Haven't tested it though.

 1616:
The suggested patch seems reasonable to me, but I do not have a  
 full list of what objects loadtxt supports today as opposed to what  
 this patch will support.

 1562:
I attach a possible patch. This could also be the default  
 behavior to my mind, since the function caller can simply call  
 numpy.squeeze if needed. Changing default behavior would probably  
 break old code, however.

See comments on Trac as well.

 1458:
The fix suggested in the ticket seems reasonable, but I have  
 never used record arrays, so I am not sure  of this.

There were some issues with Python3, and I also had some general
reservations as noted on Trac - basically, it makes 'unpack' equivalent
to transposing for 2D-arrays, but to splitting into fields for
1D-recarrays. My question was, what's going to happen when you get to
2D-recarrays? Currently this is not an issue since loadtxt can only read
2D regular or 1D structured arrays. But this might change if the data
block functionality (see below) were to be implemented - data could then
be returned as 3D arrays or 2D structured arrays... Still, it would
probably make most sense (or at least give the widest functionality) to
have 'unpack=True' always return a list or iterator over columns.
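
The 2-D case - 'unpack' as transposition - can be sketched on hypothetical two-column data:

```python
import io
import numpy as np

data = io.StringIO("1 10\n2 20\n3 30\n")
x, y = np.loadtxt(data, unpack=True)   # same result as np.loadtxt(...).T

assert x.tolist() == [1.0, 2.0, 3.0]
assert y.tolist() == [10.0, 20.0, 30.0]
```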

 1445:
Adding this functionality could break old code, as some old  
 datafiles may have empty lines which are now simply ignored. I do  
 not think the feature is a good idea. It could rather be implemented  
 as a separate function.
 1107:
I do not see the need for this enhancement. In my eyes, the  
 usecols kwarg does this and more. Perhaps I am misunderstanding  
 something here.

Agree about #1445, and the bit about 'usecols' - 'numcols' would just
provide a shorter call to e.g. read the first 20 columns of a file (well,
not even that much over 'usecols=range(20)'...), don't think that
justifies an extra argument. But the 'datablocks' provides something new,
that a number of people seem to miss from e.g. gnuplot (including me,
actually ;-). And it would also satisfy the request from #1445 without
breaking backwards compatibility. I've been wondering if one could
instead specify the separator lines through the parameter, e.g.
blocksep=['None', 'blank', 'invalid'], not sure if that would make it
more useful...

 1071:
   It is not clear to me whether loadtxt is supposed to support  
 missing values in the fashion indicated in the ticket.

In principle it should at least allow you to, by the use of converters as
described there. The problem is, the default delimiter is described as
'any whitespace', which in the present implementation obviously includes
any number of blanks or tabs. These are therefore treated differently
from delimiters like ',' or ''. I'd reckon there are too many people
actually relying on this behaviour to silently change it (e.g. I know
plenty of tables with columns separated by either one or several tabs
depending on the length of the previous entry). But the tab is apparently
also treated differently if explicitly specified with delimiter='\t' -
and in that case using a converter à la {2: lambda s: float(s or 'NaN')}
is working for fields in the middle of the line, but not at the end -
clearly warrants improvement. I've prepared a patch working for Python3
as well.
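
The converter idiom can be sketched on a small tab-delimited example (hypothetical data; the empty final field in the second row is the problematic end-of-line case):

```python
import io
import numpy as np

# second row has an empty final field
data = io.StringIO("1\t2\t3\n4\t5\t\n")
arr = np.loadtxt(data, delimiter="\t",
                 converters={2: lambda s: float(s.strip() or "NaN")})

assert arr.shape == (2, 3)
assert np.isnan(arr[1, 2])   # the empty field became NaN
```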

 1163:
 1565:
These tickets seem to have the same origin of the problem. I  
 attach one possible patch. The previously suggested patches that  
 I've seen will not correctly convert floats to ints, which I believe  
 my patch will.

+1, though I am a bit concerned that prompting to raise a ValueError for
every element could impede performance. I'd probably still enclose it in an

if issubclass(typ, np.uint64) or issubclass(typ, np.int64):

just like in npio.patch. I also thought one might switch to
int(float128(x)) in that case, but at least for the given examples
float128 cannot convert with more accuracy than float64 (even on
PowerPC ;-). There were some dissenting opinions that trying to read a
float into an int should generally throw an exception though...
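
The conversion issue behind #1163/#1565 can be sketched with a hypothetical converter (not the attached patch):

```python
# int('1.0') raises ValueError, so an int converter for loadtxt has to
# fall back through float -- the behaviour the patches aim for.
def to_int(s):
    try:
        return int(s)
    except ValueError:
        return int(float(s))  # '1.0' -> 1 (note: truncates '1.5' -> 1)

assert to_int("42") == 42
assert to_int("1.0") == 1
assert to_int("1.5") == 1
```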

And Chuck just beat me...

On 26 Mar 2011, at 21:25, Charles R Harris wrote:

 I put all these patches together at 
 https://github.com/charris/numpy/tree/loadtxt-savetxt 
 . Please pull from there to continue work on loadtxt/savetxt so as  
 to avoid conflicts in the patches. One of the numpy tests is  
 failing, I assume from patch conflicts, and more tests for the  
 tickets are needed in any case. Also, 

Re: [Numpy-discussion] ValueError: Unknown format code 'g' for object of type 'str'

2011-03-26 Thread Mark Wiebe
It turns out that Python 2.6 complex doesn't implement __format__, and that
results in the problem.

http://web.archiveorange.com/archive/v/jA6s92Ni29ENZpi4rpz5

I've disabled the complex formatting tests for 2.6 in
commit 7d436cc8994f9efbc512.

-Mark

On Sat, Mar 26, 2011 at 3:34 AM, Nils Wagner
nwag...@iam.uni-stuttgart.dewrote:

  >>> numpy.__version__
 '2.0.0.dev-10db259'

 ==
 ERROR: Test the str.format method with NumPy scalar types
 --
 Traceback (most recent call last):
   File "/home/nwagner/local/lib64/python2.6/site-packages/nose-0.11.2.dev-py2.6.egg/nose/case.py", line 183, in runTest
     self.test(*self.arg)
   File "/home/nwagner/local/lib64/python2.6/site-packages/numpy/testing/decorators.py", line 146, in skipper_func
     return f(*args, **kwargs)
   File "/home/nwagner/local/lib64/python2.6/site-packages/numpy/core/tests/test_print.py", line 223, in test_scalar_format
     assert_equal(fmat.format(val), fmat.format(valtype(val)),
 ValueError: Unknown format code 'g' for object of type 'str'


 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion