On Wed, Jan 22, 2014 at 05:53:26PM -0800, Chris Barker - NOAA Federal wrote:
On Jan 22, 2014, at 1:13 PM, Oscar Benjamin oscar.j.benja...@gmail.com
wrote:
It's not safe to stop removing the null bytes. This is how numpy determines
the length of the strings in a dtype='S' array. The
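The role of the null bytes can be illustrated without numpy. What follows is only a minimal sketch, assuming a fixed-width slot that is padded with NUL bytes on write and stripped of trailing NULs on read; `WIDTH`, `store` and `load` are hypothetical names used to model the behaviour described above, not numpy functions.

```python
WIDTH = 5  # hypothetical fixed item size, like the 5 in dtype='S5'


def store(value: bytes, width: int = WIDTH) -> bytes:
    """Pad a value with NUL bytes so it fills a fixed-width slot."""
    if len(value) > width:
        raise ValueError("value does not fit in the slot")
    return value.ljust(width, b"\x00")


def load(slot: bytes) -> bytes:
    """Recover the value by stripping the trailing NUL padding."""
    return slot.rstrip(b"\x00")


short = load(store(b"ab"))      # padding removed, original length recovered
lossy = load(store(b"ab\x00"))  # a genuine trailing NUL is stripped as well
```

In this model the stored length exists only implicitly in the padding, which is why a value that really ends in a NUL byte cannot be told apart from a shorter one.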
Hey all,
We have a PR languishing that fixes np.irr to handle negative rate-of-returns:
https://github.com/numpy/numpy/pull/4210
I don't even know what IRR stands for, and it seems rather confusing
from the discussion there. Anyone who knows something about the issues
is invited to speak up...
On Thu, Jan 23, 2014 at 5:45 AM, Oscar Benjamin
oscar.j.benja...@gmail.com wrote:
On Wed, Jan 22, 2014 at 05:53:26PM -0800, Chris Barker - NOAA Federal wrote:
On Jan 22, 2014, at 1:13 PM, Oscar Benjamin oscar.j.benja...@gmail.com
wrote:
It's not safe to stop removing the null bytes. This
On Thu, Jan 23, 2014 at 10:41 AM, josef.p...@gmail.com wrote:
On Thu, Jan 23, 2014 at 5:45 AM, Oscar Benjamin
oscar.j.benja...@gmail.com wrote:
On Wed, Jan 22, 2014 at 05:53:26PM -0800, Chris Barker - NOAA Federal wrote:
On Jan 22, 2014, at 1:13 PM, Oscar Benjamin oscar.j.benja...@gmail.com
On Thu, Jan 23, 2014 at 11:43 AM, Oscar Benjamin
oscar.j.benja...@gmail.com wrote:
On Thu, Jan 23, 2014 at 11:23:09AM -0500, josef.p...@gmail.com wrote:
another curious example, encode utf-8 to latin-1 bytes
b
array(['Õsc', 'zxc'],
dtype='U3')
b[0].encode('utf8')
b'\xc3\x95sc'
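For reference, the byte-level difference between the two encodings in this example can be checked in plain Python. This is just a small sketch using the same string as above; the variable names are mine.

```python
text = "Õsc"

utf8_bytes = text.encode("utf-8")      # 'Õ' (U+00D5) takes two bytes in UTF-8
latin1_bytes = text.encode("latin-1")  # 'Õ' takes a single byte in Latin-1

# Three characters need four bytes in UTF-8 but only three in Latin-1.
utf8_len = len(utf8_bytes)
latin1_len = len(latin1_bytes)
```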
On Thu, Jan 23, 2014 at 11:58 AM, josef.p...@gmail.com wrote:
On Thu, Jan 23, 2014 at 11:43 AM, Oscar Benjamin
oscar.j.benja...@gmail.com wrote:
On Thu, Jan 23, 2014 at 11:23:09AM -0500, josef.p...@gmail.com wrote:
another curious example, encode utf-8 to latin-1 bytes
b
array(['Õsc',
truncating null bytes in 'S' breaks decoding that needs them
a = np.array([si.encode('utf-16LE') for si in ['Õsc', 'zxc']], dtype='S')
a
array([b'\xd5\x00s\x00c', b'z\x00x\x00c'],
dtype='|S6')
[ai.decode('utf-16LE') for ai in a]
Traceback (most recent call last):
File pyshell#118,
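The failure can be reproduced with plain bytes. This is a sketch that models the truncation with `rstrip`, on the assumption that dropping trailing NUL bytes is all that happens to each item; `try_decode` is a hypothetical helper.

```python
texts = ["Õsc", "zxc"]
encoded = [t.encode("utf-16-le") for t in texts]

# Model of the truncation: trailing NUL bytes are dropped from each item,
# which leaves an odd number of bytes for these two strings.
stripped = [e.rstrip(b"\x00") for e in encoded]


def try_decode(raw: bytes):
    """Return the decoded text, or None if the bytes are not valid UTF-16LE."""
    try:
        return raw.decode("utf-16-le")
    except UnicodeDecodeError:
        return None


results = [try_decode(s) for s in stripped]

# Padding back to an even length repairs these particular items, but it
# cannot restore a text that really ended in a NUL character.
repaired = [(s + b"\x00" * (len(s) % 2)).decode("utf-16-le") for s in stripped]
```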
On 23 January 2014 17:42, josef.p...@gmail.com wrote:
On Thu, Jan 23, 2014 at 12:13 PM, josef.p...@gmail.com wrote:
On Thu, Jan 23, 2014 at 11:58 AM, josef.p...@gmail.com wrote:
No, a view doesn't change the memory, it just changes the
interpretation and there shouldn't be any conversion
Thanks for poking into this all. I've lost track a bit, but I think:
The 'S' type is clearly broken on py3 (at least). I think that gives us
room to change it, and backward compatibility is less of an issue because it's
broken already -- do we need to preserve bug-for-bug compatibility? Maybe,
but I
On Thu, Jan 23, 2014 at 1:49 PM, Chris Barker chris.bar...@noaa.gov wrote:
s = 'a string'
np.array((s,), dtype='S')[0] == s
Gives you False, rather than True on py2. This is because a py3 string is
translated to the 'S' type (presumably with the default encoding, another
maybe not a good
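The comparison result follows from how Python 3 treats bytes and text. Here is a small sketch in plain Python, assuming the element that comes back out of the 'S' array is the ASCII-encoded form of the string; `stored` stands in for that element.

```python
s = "a string"

# Assumption: this is what the 'S' array hands back for the element.
stored = s.encode("ascii")

# On Python 3 a bytes object never compares equal to a str.
same_as_text = (stored == s)

# Decoding explicitly gives text again, and the comparison succeeds.
same_after_decode = (stored.decode("ascii") == s)
```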
Josef,
Nice find -- another reason why 'S' can NOT be used as-is for arbitrary
bytes.
See the other thread for my proposals about that.
messy workaround (arrays in contrast to scalars are not truncated in
`tostring`)
[a[i:i+1].tostring().decode('utf-16LE') for i in range(len(a))]
['Õsc',
On Thu, Jan 23, 2014 at 11:18 AM, josef.p...@gmail.com wrote:
I think this is just inconsistent casting rules in numpy,
numpy should either refuse to assign the wrong type, instead of using
the repr as in some of the earlier examples of Oscar
s = np.inf
np.array((s,), dtype=int)[0] == s
numpy arrays need a decode and encode method
I'm not sure that they do. Rather there needs to be a text dtype that
knows what encoding to use in order to have a binary interface as
exposed by .tostring() and friends but produce unicode strings
when indexed from Python code. Having both
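To make the idea of a text type that knows its own encoding a bit more concrete, here is a rough pure-Python sketch. `EncodedText` is a hypothetical container invented for illustration; it is not a numpy dtype or an existing API, and it inherits the NUL-stripping limitation discussed in this thread.

```python
class EncodedText:
    """Hypothetical fixed-width text container; not a numpy API."""

    def __init__(self, items, encoding="utf-8", width=None):
        encoded = [item.encode(encoding) for item in items]
        self.encoding = encoding
        # Default width: the longest encoded item, in bytes.
        self.width = width or max(len(e) for e in encoded)
        if any(len(e) > self.width for e in encoded):
            raise ValueError("item does not fit in the fixed width")
        # One contiguous buffer, each item NUL-padded to the same width.
        self._buf = b"".join(e.ljust(self.width, b"\x00") for e in encoded)

    def __len__(self):
        return len(self._buf) // self.width

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        slot = self._buf[i * self.width:(i + 1) * self.width]
        # Indexing yields text, decoded with the stored encoding.
        return slot.rstrip(b"\x00").decode(self.encoding)

    def tobytes(self):
        """Binary interface: the raw encoded buffer."""
        return self._buf


arr = EncodedText(["Õsc", "zxc"], encoding="latin-1")
```

Indexing `arr[0]` gives a str, while `arr.tobytes()` gives the encoded buffer, so the binary and text views stay separate.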
On Thu, Jan 23, 2014 at 2:45 PM, Chris Barker chris.bar...@noaa.gov wrote:
On Thu, Jan 23, 2014 at 11:18 AM, josef.p...@gmail.com wrote:
I think this is just inconsistent casting rules in numpy,
numpy should either refuse to assign the wrong type, instead of using
the repr as in some of the
On Thu, Jan 23, 2014 at 1:36 PM, Oscar Benjamin
oscar.j.benja...@gmail.com wrote:
On 23 January 2014 17:42, josef.p...@gmail.com wrote:
On Thu, Jan 23, 2014 at 12:13 PM, josef.p...@gmail.com wrote:
On Thu, Jan 23, 2014 at 11:58 AM, josef.p...@gmail.com wrote:
No, a view doesn't change the
There have been a few threads discussing the problems of how to do
text with numpy arrays in Python 3.
To make a slightly more concrete proposal, I've implemented a pure
Python ndarray subclass that I believe can consistently handle
text/bytes in Python 3. It is intended to be an illustration
On Thu, Jan 23, 2014 at 12:10 PM, josef.p...@gmail.com wrote:
Exactly -- but what should those conversion/casting rules be? We can't
decide that unless we decide if 'S' is for text or for arbitrary bytes -- it
can't be both. I say text, that's what it's mostly trying to do already. But
Both scipy and numpy require GSOC
candidates to have a pull request accepted as part of the application
process. I'd suggest implementing a function not currently in scipy that
you think would be useful. That would also help in finding a mentor for the
summer. I'd also suggest getting
Scipy doesn't have a function for the Laplace transform, it has only a
Laplace distribution in scipy.stats and a Laplace filter in scipy.ndimage.
An inverse Laplace transform would be very welcome I'd think - it has real
world applications, and there's no good implementation in any open source
I happen to be working with De Bruijn sequences. Is there any interest in
this being part of numpy/scipy?
https://gist.github.com/vincentdavis/8588879
Vincent Davis
On 23 January 2014 21:51, Chris Barker chris.bar...@noaa.gov wrote:
However, I would prefer latin-1 -- that way you might get garbage for the
non-ascii parts, but it wouldn't raise an exception and it round-trips
through encoding/decoding. And you would have a somewhat more useful subset
--
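The round-trip property of latin-1 is easy to check in plain Python. This sketch runs every possible byte value through a decode/encode cycle and compares that with what ASCII does on the same data.

```python
raw = bytes(range(256))  # every possible byte value, once

# Latin-1 maps each byte to the code point with the same number,
# so decoding never fails and encoding gives the original bytes back.
as_text = raw.decode("latin-1")
round_trip = as_text.encode("latin-1")

# ASCII, by contrast, rejects any byte above 0x7f.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```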
On Thu, Jan 23, 2014 at 4:02 PM, Oscar Benjamin
oscar.j.benja...@gmail.com wrote:
On 23 January 2014 21:51, Chris Barker chris.bar...@noaa.gov wrote:
However, I would prefer latin-1 -- that way you might get garbage for
the
non-ascii parts, but it wouldn't raise an exception and it
On Thu, Jan 23, 2014 at 3:56 PM, josef.p...@gmail.com wrote:
I'm not sure anymore, after all these threads I think bytes should be
bytes and strings should be strings
exactly -- that's the py3 model, and I think we really should try to conform
to it, it's really the only way to have a robust
On 24 January 2014 01:09, Chris Barker chris.bar...@noaa.gov wrote:
On Thu, Jan 23, 2014 at 4:02 PM, Oscar Benjamin oscar.j.benja...@gmail.com
wrote:
On 23 January 2014 21:51, Chris Barker chris.bar...@noaa.gov wrote:
However, I would prefer latin-1 -- that way you might get garbage for