Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-12-09 Thread Francesc Alted
On Sunday 06 December 2009 11:47:23 Francesc Alted wrote: On Saturday 05 December 2009 11:16:55 Dag Sverre Seljebotn wrote:
In [19]: t = np.dtype("i4,f4")
In [20]: t
Out[20]: dtype([('f0', 'i4'), ('f1', 'f4')])
In [21]: hash(t)
Out[21]: -9041335829180134223
In [22]:

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-12-06 Thread Francesc Alted
On Saturday 05 December 2009 11:16:55 Dag Sverre Seljebotn wrote: Mmh, the only case of dtype *mutability* that I'm aware of is changing the names of compound types:
In [19]: t = np.dtype("i4,f4")
In [20]: t
Out[20]: dtype([('f0', 'i4'), ('f1', 'f4')])
In [21]: hash(t)
Out[21]:
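The mutability described in this message can be sketched as follows. This is a minimal illustration, assuming a current NumPy release in which `dtype.names` is still assignable; the replacement names `one` and `two` are hypothetical, and concrete hash values differ between runs and versions, so none are shown.

```python
import numpy as np

# Build the compound dtype from the message; field names default to f0, f1.
t = np.dtype("i4,f4")
names_before = t.names
hash_before = hash(t)

# Rename the fields in place (hypothetical names, for illustration only).
t.names = ("one", "two")
names_after = t.names
hash_after = hash(t)

print(names_before, "->", names_after)
# Whether the hash follows the rename is the point under discussion,
# so print the comparison instead of assuming a result.
print("hash changed:", hash_before != hash_after)
```

If the hash does change after the rename, a dtype used as a dictionary key before the rename may no longer be found afterwards, which is the usual concern with hashable but mutable objects.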

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-12-05 Thread Dag Sverre Seljebotn
Francesc Alted wrote: On Thursday 03 December 2009 14:56:16 Dag Sverre Seljebotn wrote: Pauli Virtanen wrote: Thu, 03 Dec 2009 14:03:13 +0100, Dag Sverre Seljebotn wrote: [clip] Great! Are you storing the format string in the dtype types as well? (So that no release is needed and

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-12-05 Thread David Cournapeau
On Sat, Dec 5, 2009 at 7:16 PM, Dag Sverre Seljebotn da...@student.matnat.uio.no wrote: Perhaps this should be marked as a bug?  I'm not sure about that, because the above seems quite useful. Well, I for one don't like this, but that's just an opinion. I think it is unwise to leave object

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-12-04 Thread Francesc Alted
On Thursday 03 December 2009 14:56:16 Dag Sverre Seljebotn wrote: Pauli Virtanen wrote: Thu, 03 Dec 2009 14:03:13 +0100, Dag Sverre Seljebotn wrote: [clip] Great! Are you storing the format string in the dtype types as well? (So that no release is needed and acquisitions are cheap...)

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-12-04 Thread David Cournapeau
On Fri, Dec 4, 2009 at 9:23 PM, Francesc Alted fal...@pytables.org wrote: On Thursday 03 December 2009 14:56:16 Dag Sverre Seljebotn wrote: Pauli Virtanen wrote: Thu, 03 Dec 2009 14:03:13 +0100, Dag Sverre Seljebotn wrote: [clip] Great! Are you storing the format string in the dtype

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-12-04 Thread Bruce Southey
On 12/04/2009 10:12 AM, David Cournapeau wrote: On Fri, Dec 4, 2009 at 9:23 PM, Francesc Alted fal...@pytables.org wrote: On Thursday 03 December 2009 14:56:16 Dag Sverre Seljebotn wrote: Pauli Virtanen wrote: Thu, 03 Dec 2009 14:03:13 +0100, Dag Sverre Seljebotn wrote:

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-12-04 Thread David Cournapeau
On Sat, Dec 5, 2009 at 1:31 AM, Bruce Southey bsout...@gmail.com wrote: On 12/04/2009 10:12 AM, David Cournapeau wrote: On Fri, Dec 4, 2009 at 9:23 PM, Francesc Alted fal...@pytables.org wrote: On Thursday 03 December 2009 14:56:16 Dag Sverre Seljebotn wrote: Pauli Virtanen wrote: Thu,

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-12-04 Thread Francesc Alted
On Friday 04 December 2009 17:12:09 David Cournapeau wrote: Mmh, the only case of dtype *mutability* that I'm aware of is changing the names of compound types:
In [19]: t = np.dtype("i4,f4")
In [20]: t
Out[20]: dtype([('f0', 'i4'), ('f1', 'f4')])
In [21]: hash(t)
Out[21]:

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-12-04 Thread David Cournapeau
On Sat, Dec 5, 2009 at 1:57 AM, David Cournapeau courn...@gmail.com wrote: On Sat, Dec 5, 2009 at 1:31 AM, Bruce Southey bsout...@gmail.com wrote: On 12/04/2009 10:12 AM, David Cournapeau wrote: On Fri, Dec 4, 2009 at 9:23 PM, Francesc Alted fal...@pytables.org wrote: On Thursday 03 December

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-12-04 Thread Bruce Southey
On 12/04/2009 10:57 AM, David Cournapeau wrote: On Sat, Dec 5, 2009 at 1:31 AM, Bruce Southey bsout...@gmail.com wrote: On 12/04/2009 10:12 AM, David Cournapeau wrote: On Fri, Dec 4, 2009 at 9:23 PM, Francesc Alted fal...@pytables.org wrote: On Thursday 03 December

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-12-03 Thread Dag Sverre Seljebotn
Dag Sverre Seljebotn wrote: Dag Sverre Seljebotn wrote: Pauli Virtanen wrote: Fri, 27 Nov 2009 23:19:58 +0100, Dag Sverre Seljebotn wrote: [clip] One thing to keep in mind here is that PEP 3118 actually defines a standard dtype format string, which is (mostly)

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-12-03 Thread Pauli Virtanen
Thu, 03 Dec 2009 14:03:13 +0100, Dag Sverre Seljebotn wrote: [clip] Great! Are you storing the format string in the dtype types as well? (So that no release is needed and acquisitions are cheap...) I regenerate it on each buffer acquisition. It's simple low-level C code, and I suspect it will
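The format string being discussed is the one exposed through the PEP 3118 buffer interface. A minimal sketch, assuming a current NumPy release on Python 3 (it shows what a consumer sees on acquisition, not how NumPy builds the string internally):

```python
import numpy as np

arr = np.zeros(3, dtype=np.float32)

# Acquiring a buffer through memoryview exposes the PEP 3118 fields.
view = memoryview(arr)
fmt = view.format      # struct-style format string for one element
size = view.itemsize   # bytes per element
view.release()         # explicit release of the acquired buffer

print(fmt, size)
```

Each `memoryview(arr)` call is one buffer acquisition, so the cost of producing the format string is paid at that point.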

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-12-03 Thread Dag Sverre Seljebotn
Pauli Virtanen wrote: Thu, 03 Dec 2009 14:03:13 +0100, Dag Sverre Seljebotn wrote: [clip] Great! Are you storing the format string in the dtype types as well? (So that no release is needed and acquisitions are cheap...) I regenerate it on each buffer acquisition. It's simple

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-27 Thread Pauli Virtanen
Thu, 2009-11-26 at 17:37 -0700, Charles R Harris wrote: [clip] I'm not clear on your recommendation here, is it that we should use bytes, with unicode converted to UTF8? The point is that I don't think we can just decide to use Unicode or Bytes in all places where PyString was used

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-27 Thread David Cournapeau
Pauli Virtanen wrote: By the way, should I commit this stuff (after factoring the commits to logical chunks) to SVN? I would prefer getting at least one py3 buildbot before doing anything significant, cheers, David

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-27 Thread Pauli Virtanen
Fri, 2009-11-27 at 18:30 +0900, David Cournapeau wrote: Pauli Virtanen wrote: By the way, should I commit this stuff (after factoring the commits to logical chunks) to SVN? I would prefer getting at least one py3 buildbot before doing anything significant, I can add it to mine:

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-27 Thread Francesc Alted
On Friday 27 November 2009 10:47:53 Pauli Virtanen wrote: 1) For 'S' dtype, I believe we use Bytes for the raw data and the interface. Maybe we want to introduce a separate bytes dtype that's an alias for 'S'? Yeah. As regular strings in Python 3 are Unicode, I think that

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-27 Thread Pauli Virtanen
Fri, 2009-11-27 at 11:17 +0100, Francesc Alted wrote: On Friday 27 November 2009 10:47:53 Pauli Virtanen wrote: 1) For 'S' dtype, I believe we use Bytes for the raw data and the interface. Maybe we want to introduce a separate bytes dtype that's an alias for 'S'?

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-27 Thread Francesc Alted
On Friday 27 November 2009 11:27:00 Pauli Virtanen wrote: Yes. But now I wonder, should array(['foo'], str) and array(['foo']) be of dtype 'S' or 'U' in Python 3? I think I'm leaning towards 'U', which will mean unavoidable code breakage -- there's probably no avoiding it.
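For reference, the question can be checked against a present-day installation. A small sketch, assuming a current NumPy release on Python 3 (it reports today's behaviour, not what was implemented at the time of the thread):

```python
import numpy as np

text_arr = np.array(["foo"])            # built from a str literal
str_arr = np.array(["foo"], dtype=str)  # dtype requested as the built-in str
bytes_arr = np.array([b"foo"])          # built from a bytes literal

# dtype.kind is 'U' for Unicode strings and 'S' for byte strings.
kinds = (text_arr.dtype.kind, str_arr.dtype.kind, bytes_arr.dtype.kind)
print(kinds)
```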

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-27 Thread René Dudfield
On Fri, Nov 27, 2009 at 11:50 AM, Francesc Alted fal...@pytables.org wrote: On Friday 27 November 2009 11:27:00 Pauli Virtanen wrote: Yes. But now I wonder, should array(['foo'], str) and array(['foo']) be of dtype 'S' or 'U' in Python 3? I think I'm leaning towards 'U', which

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-27 Thread Pauli Virtanen
Fri, 2009-11-27 at 13:23 +0100, René Dudfield wrote: [clip] I imagine dtype 'S' and 'U' need more clarification. As it misses the concept of encodings it seems? Currently, S appears to mean 8bit characters no encoding, and U appears to mean 16bit characters no encoding? Or are some
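The per-character storage of the two dtypes can be inspected directly. A minimal sketch, assuming a current NumPy release; the sizes it prints describe that release and are not a statement about the 2009 code base:

```python
import numpy as np

# Bytes used to store a single character in each string dtype.
s_size = np.dtype("S1").itemsize
u_size = np.dtype("U1").itemsize

print("S1:", s_size, "byte(s); U1:", u_size, "byte(s)")
```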

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-27 Thread Francesc Alted
On Friday 27 November 2009 13:23:10 René Dudfield wrote: I don't think they are internally UTF-8: http://docs.python.org/3.1/c-api/unicode.html Python’s default builds use a 16-bit type for Py_UNICODE and store Unicode values internally as UCS2. Ah! No changes for that matter.

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-27 Thread René Dudfield
On Fri, Nov 27, 2009 at 1:41 PM, Pauli Virtanen p...@iki.fi wrote: 2to3/3to2 fixers will probably have to be written for users code here... whatever is decided.  At least warnings should be generated I'm guessing. Possibly. Does 2to3 support plugins? If yes, it could be possible to write

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-27 Thread René Dudfield
On Fri, Nov 27, 2009 at 1:49 PM, Francesc Alted fal...@pytables.org wrote: Correct.  But, in addition, we are going to need a new 'bytes' dtype for NumPy for Python 3, right? I think so. However, I think S is probably closest to bytes... and maybe S can be reused for bytes... I'm not sure

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-27 Thread René Dudfield
On Fri, Nov 27, 2009 at 3:07 PM, René Dudfield ren...@gmail.com wrote: hey, yeah I definitely would :)   I don't have much time for the next week or so though. btw, feel free to just copy whatever you like from there into your tree. cheers,

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-27 Thread Francesc Alted
On Friday 27 November 2009 15:09:00 René Dudfield wrote: On Fri, Nov 27, 2009 at 1:49 PM, Francesc Alted fal...@pytables.org wrote: Correct. But, in addition, we are going to need a new 'bytes' dtype for NumPy for Python 3, right? I think so. However, I think S is probably closest to

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-27 Thread Pauli Virtanen
Fri, 2009-11-27 at 16:33 +0100, Francesc Alted wrote: On Friday 27 November 2009 15:09:00 René Dudfield wrote: On Fri, Nov 27, 2009 at 1:49 PM, Francesc Alted fal...@pytables.org wrote: Correct. But, in addition, we are going to need a new 'bytes' dtype for NumPy for Python 3,

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-27 Thread Francesc Alted
On Friday 27 November 2009 16:41:04 Pauli Virtanen wrote: I think so. However, I think S is probably closest to bytes... and maybe S can be reused for bytes... I'm not sure though. That could be a good idea because that would ensure compatibility with existing NumPy scripts (i.e.

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-27 Thread Christopher Barker
The point is that I don't think we can just decide to use Unicode or Bytes in all places where PyString was used earlier. Agreed. I think it's helpful to remember the origins of all this: IMHO, there are two distinct types of data that Python2 strings support: 1) text: this is the

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-27 Thread Pauli Virtanen
Fri, 2009-11-27 at 10:36 -0800, Christopher Barker wrote: [clip] Which one it will be should depend on the use. Users will expect that e.g. array([1,2,3], dtype='f4') still works, and they don't have to do e.g. array([1,2,3], dtype=b'f4'). Personally, I try to use np.float32
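The two spellings mentioned in this message can be compared side by side. A brief sketch, assuming a current NumPy release on Python 3:

```python
import numpy as np

# Same array, once with a string dtype code and once with the type object.
a = np.array([1, 2, 3], dtype="f4")
b = np.array([1, 2, 3], dtype=np.float32)

same_dtype = a.dtype == b.dtype
print(a.dtype, b.dtype, same_dtype)
```

Using the type object sidesteps the question of whether the dtype code is given as text or bytes.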

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-27 Thread Anne Archibald
2009/11/27 Christopher Barker chris.bar...@noaa.gov: The point is that I don't think we can just decide to use Unicode or Bytes in all places where PyString was used earlier. Agreed. I only half agree. It seems to me that for almost all situations where PyString was used, the right data type

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-27 Thread Christopher Barker
Anne Archibald wrote: I don't think it makes sense to handle format strings in Unicode internally -- they should always be coerced to bytes. This should be fine -- we control what is a valid format string, and thus they can always be ASCII-safe. I have to disagree. Why should we force the

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-27 Thread Dag Sverre Seljebotn
Francesc Alted wrote: On Friday 27 November 2009 16:41:04 Pauli Virtanen wrote: I think so. However, I think S is probably closest to bytes... and maybe S can be reused for bytes... I'm not sure though. That could be a good idea because that would ensure compatibility with existing NumPy

[Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-26 Thread Pauli Virtanen
Hi, The Python 3 porting needs some decisions on what is Bytes and what is Unicode. I'm currently taking the following approach. Comments? *** dtype field names Either Bytes or Unicode. But 'a' and b'a' are *different* fields. The issue is that:
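The remark that 'a' and b'a' are different fields follows from how Python 3 treats str and bytes. The sketch below uses a plain dict as a hypothetical stand-in for a field-name lookup; it is not NumPy's own field table.

```python
# In Python 3, a str and a bytes object never compare equal,
# even when they hold the same ASCII character.
fields = {
    "a": "field named with text",
    b"a": "field named with bytes",
}

# Both keys survive, so the mapping holds two separate entries.
entry_count = len(fields)
print(entry_count, fields["a"], "|", fields[b"a"])
```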

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-26 Thread Charles R Harris
Hi Pauli, On Thu, Nov 26, 2009 at 4:08 PM, Pauli Virtanen p...@iki.fi wrote: Hi, The Python 3 porting needs some decisions on what is Bytes and what is Unicode. I'm currently taking the following approach. Comments? *** dtype field names Either Bytes or Unicode.

Re: [Numpy-discussion] Bytes vs. Unicode in Python3

2009-11-26 Thread René Dudfield
On Fri, Nov 27, 2009 at 1:37 AM, Charles R Harris charlesr.har...@gmail.com wrote: Hi Pauli, On Thu, Nov 26, 2009 at 4:08 PM, Pauli Virtanen p...@iki.fi wrote: Hi, The Python 3 porting needs some decisions on what is Bytes and what is Unicode. I'm currently taking the following approach.

Re: [Numpy-discussion] .bytes

2007-10-15 Thread Yves Revaz
Nadav Horesh wrote: array(1, dtype=float32).itemsize ok, it will work fine for my purpose. In numpy, is there any reason to suppress the attribute .bytes from the type object itself? Is it simply because the native python types (int, float, complex, etc.) do not have this attribute?

Re: [Numpy-discussion] .bytes

2007-10-15 Thread Robert Kern
Yves Revaz wrote: Nadav Horesh wrote: array(1, dtype=float32).itemsize ok, it will work fine for my purpose. In numpy, is there any reason to suppress the attribute .bytes from the type object itself? Is it simply because the native python types (int, float, complex, etc.) do not have

[Numpy-discussion] .bytes

2007-10-14 Thread Yves Revaz
Dear list, I'm translating code from numarray to numpy. Unfortunately, I'm unable to find the equivalent of the command that gives the number of bytes for a given type. With numarray I used Float32.bytes, which gives 4. I'm sure there is a solution in numpy, but I'm unable to find it. Thanks, Yves

Re: [Numpy-discussion] .bytes

2007-10-14 Thread Matthieu Brucher
Hi, In the description field, you have itemsize which is what you want. Matthieu 2007/10/14, Yves Revaz [EMAIL PROTECTED]: Dear list, I'm translating code from numarray to numpy. Unfortunately, I'm unable to find the equivalent of the command that gives the number of bytes for a given
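The `itemsize` suggestion from this thread can be tried in three equivalent ways. A short sketch, assuming a current NumPy release:

```python
import numpy as np

# Size in bytes of one float32 element, obtained three ways.
from_dtype = np.dtype(np.float32).itemsize           # via the dtype object
from_scalar = np.float32(0).itemsize                 # via a scalar instance
from_array = np.array(1, dtype=np.float32).itemsize  # via an array, as suggested in the thread

print(from_dtype, from_scalar, from_array)
```

The dtype-based form is the closest counterpart to numarray's `Float32.bytes`, since it needs no array or scalar instance.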