On Sun, Feb 22, 2015 at 7:29 PM, Sturla Molden <sturla.mol...@gmail.com>
wrote:
>
> On 22/02/15 19:21, Aldcroft, Thomas wrote:
>
> > Problems like this are now showing up in the wild [3].  Workarounds are
> > also showing up, like a way to easily convert from 'S' to 'U' within
> > astropy Tables [4], but this is really not a desirable way to go.
> > Gigabyte-sized string data arrays are not uncommon, so converting to
> > UCS-4 is a real memory and performance hit.
>
> Why UCS-4? Python's internal "flexible string representation" will
> use ASCII for ASCII text.

numpy's 'U' dtype is UCS-4, and this is what Thomas is referring to, not
Python's string type. It cannot have a flexible representation as it *is*
the representation. Python 3's `str` type is opaque, so it can freely
choose how to represent the data in memory. numpy dtypes transparently
describe how the data is represented in memory.
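
A quick way to see the difference, as a rough sketch (the sample words and
variable names are made up for illustration; the per-character figure for
`str` is CPython-specific behaviour since 3.3, not a language guarantee):

```python
import sys

import numpy as np

# Hypothetical sample data: three 4-character ASCII words.
words = ["spam", "eggs", "tofu"]

# 'S4' stores 1 byte per character; 'U4' stores 4 bytes per character
# (UCS-4), regardless of whether the text is pure ASCII.
a_bytes = np.array(words, dtype="S4")
a_unicode = np.array(words, dtype="U4")

print(a_bytes.dtype.itemsize, a_bytes.nbytes)
print(a_unicode.dtype.itemsize, a_unicode.nbytes)

# The dtype fully describes the memory layout, so the element size cannot
# vary with the content. Python's str hides its layout, which leaves the
# interpreter free to pick a compact one for ASCII-only text.
# Assumption: CPython 3.3+, where the flexible representation applies.
ascii_bytes_per_char = (
    sys.getsizeof("a" * 2000) - sys.getsizeof("a" * 1000)
) / 1000
print(ascii_bytes_per_char)
```

The 'U' array takes four times the memory of the 'S' array for the same
ASCII data, which is the cost Thomas is describing for gigabyte-sized
arrays.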

--
Robert Kern
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion