Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-23 Thread Oscar Benjamin
On Wed, Jan 22, 2014 at 05:53:26PM -0800, Chris Barker - NOAA Federal wrote: On Jan 22, 2014, at 1:13 PM, Oscar Benjamin oscar.j.benja...@gmail.com wrote: It's not safe to stop removing the null bytes. This is how numpy determines the length of the strings in a dtype='S' array. The

[Numpy-discussion] IRR

2014-01-23 Thread Nathaniel Smith
Hey all, We have a PR languishing that fixes np.irr to handle negative rates of return: https://github.com/numpy/numpy/pull/4210 I don't even know what IRR stands for, and it seems rather confusing from the discussion there. Anyone who knows something about the issues is invited to speak up...
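For readers unfamiliar with the term: IRR is the internal rate of return, the discount rate at which the net present value (NPV) of a series of cash flows is zero. The sketch below is a plain-Python illustration, not the code from the PR. The names `npv` and `irr_bisect` and the search interval are assumptions chosen for this example. It shows that a negative rate is a valid answer when the inflows are smaller than the initial outlay.

```python
def npv(rate, cashflows):
    """Net present value of cashflows[t] received at period t."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cashflows))


def irr_bisect(cashflows, lo=-0.99, hi=10.0, tol=1e-12, max_iter=200):
    """Find a rate in [lo, hi] where NPV crosses zero, by bisection.

    Assumes NPV changes sign exactly once on the interval, which holds
    for an initial outlay followed only by inflows.
    """
    f_lo = npv(lo, cashflows)
    f_hi = npv(hi, cashflows)
    if f_lo * f_hi > 0:
        raise ValueError("NPV does not change sign on the interval")
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        f_mid = npv(mid, cashflows)
        if abs(f_mid) < tol or (hi - lo) < tol:
            return mid
        # Keep the half-interval that still brackets the root.
        if f_lo * f_mid < 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2.0


gain = irr_bisect([-100.0, 110.0])  # invest 100, get 110 back one period later
loss = irr_bisect([-100.0, 90.0])   # invest 100, get only 90 back
print(round(gain, 6), round(loss, 6))
```

Bisection is used here only because it is easy to check by hand; cash-flow series with several sign changes can have more than one root, and this sketch does not handle that case.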

Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-23 Thread josef . pktd
On Thu, Jan 23, 2014 at 5:45 AM, Oscar Benjamin oscar.j.benja...@gmail.com wrote: On Wed, Jan 22, 2014 at 05:53:26PM -0800, Chris Barker - NOAA Federal wrote: On Jan 22, 2014, at 1:13 PM, Oscar Benjamin oscar.j.benja...@gmail.com wrote: It's not safe to stop removing the null bytes. This

Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-23 Thread josef . pktd
On Thu, Jan 23, 2014 at 10:41 AM, josef.p...@gmail.com wrote: On Thu, Jan 23, 2014 at 5:45 AM, Oscar Benjamin oscar.j.benja...@gmail.com wrote: On Wed, Jan 22, 2014 at 05:53:26PM -0800, Chris Barker - NOAA Federal wrote: On Jan 22, 2014, at 1:13 PM, Oscar Benjamin oscar.j.benja...@gmail.com

Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-23 Thread josef . pktd
On Thu, Jan 23, 2014 at 11:43 AM, Oscar Benjamin oscar.j.benja...@gmail.com wrote: On Thu, Jan 23, 2014 at 11:23:09AM -0500, josef.p...@gmail.com wrote: another curious example, encode utf-8 to latin-1 bytes b array(['Õsc', 'zxc'], dtype='U3') b[0].encode('utf8') b'\xc3\x95sc'
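The difference between the two encodings in that example can be reproduced without NumPy. This is a small standard-library sketch; the variable names are assumptions for illustration and do not come from the thread.

```python
text = "\u00d5sc"  # the string 'Õsc' from the example

utf8_bytes = text.encode("utf-8")      # U+00D5 takes two bytes in UTF-8
latin1_bytes = text.encode("latin-1")  # U+00D5 takes one byte in Latin-1

print(utf8_bytes, len(utf8_bytes))
print(latin1_bytes, len(latin1_bytes))

# Converting UTF-8 bytes to Latin-1 bytes means decoding to text first.
transcoded = utf8_bytes.decode("utf-8").encode("latin-1")

# Decoding UTF-8 bytes with the wrong codec does not raise; it yields mojibake.
mojibake = utf8_bytes.decode("latin-1")
```

The same text therefore needs a different number of bytes per item depending on the encoding, which matters for any fixed-width bytes dtype.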

Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-23 Thread josef . pktd
On Thu, Jan 23, 2014 at 11:58 AM, josef.p...@gmail.com wrote: On Thu, Jan 23, 2014 at 11:43 AM, Oscar Benjamin oscar.j.benja...@gmail.com wrote: On Thu, Jan 23, 2014 at 11:23:09AM -0500, josef.p...@gmail.com wrote: another curious example, encode utf-8 to latin-1 bytes b array(['Õsc',

[Numpy-discussion] cannot decode 'S'

2014-01-23 Thread josef . pktd
truncating null bytes in 'S' breaks decoding that needs them a = np.array([si.encode('utf-16LE') for si in ['Õsc', 'zxc']], dtype='S') a array([b'\xd5\x00s\x00c', b'z\x00x\x00c'], dtype='|S6') [ai.decode('utf-16LE') for ai in a] Traceback (most recent call last): File pyshell#118,
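The failure reported above can be imitated in plain Python. In this sketch `rstrip(b"\x00")` stands in for the trailing-null stripping that the post describes for 'S' scalars; that substitution, the helper `try_decode`, and the fixed `item_size` are assumptions made for the illustration, not NumPy API.

```python
words = ["\u00d5sc", "zxc"]
encoded = [w.encode("utf-16-le") for w in words]  # 6 bytes per word

# Stand-in for the reported 'S' scalar behaviour: trailing NUL bytes dropped.
truncated = [raw.rstrip(b"\x00") for raw in encoded]


def try_decode(raw):
    """Decode as UTF-16-LE, returning an error description on failure."""
    try:
        return raw.decode("utf-16-le")
    except UnicodeDecodeError as exc:
        return "error: " + exc.reason


results = [try_decode(raw) for raw in truncated]

# Padding back to the fixed item size makes the bytes decodable again.
item_size = 6
restored = [raw.ljust(item_size, b"\x00").decode("utf-16-le") for raw in truncated]
```

Padding only recovers the text because every item here originally filled the full width; in general the stripped bytes cannot be told apart from padding.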

Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-23 Thread Oscar Benjamin
On 23 January 2014 17:42, josef.p...@gmail.com wrote: On Thu, Jan 23, 2014 at 12:13 PM, josef.p...@gmail.com wrote: On Thu, Jan 23, 2014 at 11:58 AM, josef.p...@gmail.com wrote: No, a view doesn't change the memory, it just changes the interpretation and there shouldn't be any conversion
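The point that a view reinterprets memory without converting it can be illustrated with the standard library's `memoryview`, used here as an analogy to NumPy views rather than as a description of NumPy internals. The buffer contents and variable names are assumptions for the example.

```python
buf = bytearray(b"\xd5\x00s\x00c\x00")  # 6 bytes of UTF-16-LE text

as_bytes = memoryview(buf)   # 6 items, one byte each
as_u16 = as_bytes.cast("H")  # 3 items, two bytes each, same memory

# Both views expose exactly the same underlying bytes.
same_memory = as_u16.tobytes() == bytes(buf)

# Writing through one view is visible through the other and in the buffer.
as_u16[0] = 0
first_two = bytes(buf[:2])
```

No bytes are copied or re-encoded by `cast`; only the item size and count change.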

Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-23 Thread Chris Barker
Thanks for poking into this all. I've lost track a bit, but I think: The 'S' type is clearly broken on py3 (at least). I think that gives us room to change it, and backward compatibility is less of an issue because it's broken already -- do we need to preserve bug-for-bug compatibility? Maybe, but I

Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-23 Thread josef . pktd
On Thu, Jan 23, 2014 at 1:49 PM, Chris Barker chris.bar...@noaa.gov wrote: s = 'a string' np.array((s,), dtype='S')[0] == s gives you False on py3, rather than True as on py2. This is because a py3 string is translated to the 'S' type (presumably with the default encoding, another maybe not a good
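The comparison result follows from Python 3's separation of text and bytes and can be seen without NumPy. In this sketch `stored` is an assumed stand-in for the bytes value that an 'S' array element would hand back; the ASCII codec is chosen only for the example.

```python
s = "a string"
stored = s.encode("ascii")  # stand-in for the bytes held in an 'S' element

# On Python 3 a bytes object never compares equal to a str.
same_as_text = (stored == s)

# Equality returns once the bytes are explicitly decoded back to text.
same_after_decode = (stored.decode("ascii") == s)
```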

Re: [Numpy-discussion] cannot decode 'S'

2014-01-23 Thread Chris Barker
Josef, Nice find -- another reason why 'S' can NOT be used as-is for arbitrary bytes. See the other thread for my proposals about that. messy workaround (arrays in contrast to scalars are not truncated in `tostring`) [a[i:i+1].tostring().decode('utf-16LE') for i in range(len(a))] ['Õsc',

Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-23 Thread Chris Barker
On Thu, Jan 23, 2014 at 11:18 AM, josef.p...@gmail.com wrote: I think this is just inconsistent casting rules in numpy, numpy should either refuse to assign the wrong type, instead of using the repr as in some of the earlier examples of Oscar s = np.inf np.array((s,), dtype=int)[0] == s

Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-23 Thread josef . pktd
numpy arrays need a decode and encode method I'm not sure that they do. Rather there needs to be a text dtype that knows what encoding to use in order to have a binary interface as exposed by .tostring() and friends but produce unicode strings when indexed from Python code. Having both

Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-23 Thread josef . pktd
On Thu, Jan 23, 2014 at 2:45 PM, Chris Barker chris.bar...@noaa.gov wrote: On Thu, Jan 23, 2014 at 11:18 AM, josef.p...@gmail.com wrote: I think this is just inconsistent casting rules in numpy, numpy should either refuse to assign the wrong type, instead of using the repr as in some of the

Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-23 Thread josef . pktd
On Thu, Jan 23, 2014 at 1:36 PM, Oscar Benjamin oscar.j.benja...@gmail.com wrote: On 23 January 2014 17:42, josef.p...@gmail.com wrote: On Thu, Jan 23, 2014 at 12:13 PM, josef.p...@gmail.com wrote: On Thu, Jan 23, 2014 at 11:58 AM, josef.p...@gmail.com wrote: No, a view doesn't change the

[Numpy-discussion] Text array dtype for numpy

2014-01-23 Thread Oscar Benjamin
There have been a few threads discussing the problems of how to do text with numpy arrays in Python 3. To make a slightly more concrete proposal, I've implemented a pure Python ndarray subclass that I believe can consistently handle text/bytes in Python 3. It is intended to be an illustration

Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-23 Thread Chris Barker
On Thu, Jan 23, 2014 at 12:10 PM, josef.p...@gmail.com wrote: Exactly -- but what should those conversion/casting rules be? We can't decide that unless we decide if 'S' is for text or for arbitrary bytes -- it can't be both. I say text, that's what it's mostly trying to do already. But

Re: [Numpy-discussion] (no subject)

2014-01-23 Thread jennifer stone
Both scipy and numpy require GSOC candidates to have a pull request accepted as part of the application process. I'd suggest implementing a function not currently in scipy that you think would be useful. That would also help in finding a mentor for the summer. I'd also suggest getting

Re: [Numpy-discussion] (no subject)

2014-01-23 Thread jennifer stone
Scipy doesn't have a function for the Laplace transform, it has only a Laplace distribution in scipy.stats and a Laplace filter in scipy.ndimage. An inverse Laplace transform would be very welcome I'd think - it has real world applications, and there's no good implementation in any open source

[Numpy-discussion] De Bruijn sequence

2014-01-23 Thread Vincent Davis
I happen to be working with De Bruijn sequences. Is there any interest in this being part of numpy/scipy? https://gist.github.com/vincentdavis/8588879 Vincent Davis

Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-23 Thread Oscar Benjamin
On 23 January 2014 21:51, Chris Barker chris.bar...@noaa.gov wrote: However, I would prefer latin-1 -- that way you might get garbage for the non-ascii parts, but it wouldn't raise an exception and it round-trips through encoding/decoding. And you would have a somewhat more useful subset --
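The round-trip property claimed for latin-1 is easy to check with the standard library: every byte value 0-255 maps to exactly one code point, so decoding never raises and re-encoding returns the original bytes. The helper `decodes` and the sample bytes below are assumptions for illustration.

```python
all_bytes = bytes(range(256))

# Latin-1 maps each byte to the code point with the same number.
round_trip = all_bytes.decode("latin-1").encode("latin-1")


def decodes(raw, codec):
    """Return True if raw can be decoded with codec without an exception."""
    try:
        raw.decode(codec)
        return True
    except UnicodeDecodeError:
        return False


sample = b"\xd5sc"  # 'Õsc' encoded as Latin-1
outcome = {codec: decodes(sample, codec) for codec in ("ascii", "utf-8", "latin-1")}
```

Whether the decoded text is meaningful still depends on the data actually being Latin-1; the codec only guarantees that nothing is lost or rejected.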

Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-23 Thread Chris Barker
On Thu, Jan 23, 2014 at 4:02 PM, Oscar Benjamin oscar.j.benja...@gmail.com wrote: On 23 January 2014 21:51, Chris Barker chris.bar...@noaa.gov wrote: However, I would prefer latin-1 -- that way you might get garbage for the non-ascii parts, but it wouldn't raise an exception and it

Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-23 Thread Chris Barker
On Thu, Jan 23, 2014 at 3:56 PM, josef.p...@gmail.com wrote: I'm not sure anymore, after all these threads I think bytes should be bytes and strings should be strings exactly -- that's the py3 model, and I think we really should try to conform to it, it's really the only way to have a robust

Re: [Numpy-discussion] using loadtxt to load a text file in to a numpy array

2014-01-23 Thread Oscar Benjamin
On 24 January 2014 01:09, Chris Barker chris.bar...@noaa.gov wrote: On Thu, Jan 23, 2014 at 4:02 PM, Oscar Benjamin oscar.j.benja...@gmail.com wrote: On 23 January 2014 21:51, Chris Barker chris.bar...@noaa.gov wrote: However, I would prefer latin-1 -- that way you might get garbage for