Re: [Numpy-discussion] Python3, genfromtxt and unicode

Antony Lee Tue, 01 May 2012 12:24:49 -0700

Sure, I will.  Right now my solution is to use genfromtxt once with bytes
and auto-dtype detection, then modify the resulting dtype, replacing the
bytes fields with unicode ones, and use that new dtype for a second round
of genfromtxt.  A bit awkward, but it gets the job done.
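
A minimal sketch of that two-pass workaround, not a definitive
implementation.  The helper name bytes_fields_to_unicode is made up for
illustration, and it assumes the detected dtype is a structured one.  On
NumPy versions where the first pass already returns unicode fields, the
rewrite simply leaves the dtype unchanged.

```python
import io
import numpy as np

def bytes_fields_to_unicode(dtype):
    """Hypothetical helper: copy a structured dtype, replacing every
    bytes ('S') field with a unicode ('U') field of the same length."""
    fields = []
    for name in dtype.names:
        field = dtype[name]
        if field.kind == "S":
            # For 'S' dtypes, itemsize equals the number of characters.
            field = np.dtype("U%d" % field.itemsize)
        fields.append((name, field))
    return np.dtype(fields)

data = b"abc 1\ndef 2"

# Pass 1: let genfromtxt detect the column types and string widths.
detected = np.genfromtxt(io.BytesIO(data), dtype=None).dtype

# Pass 2: read again, this time forcing the rewritten dtype.
t = np.genfromtxt(io.BytesIO(data), dtype=bytes_fields_to_unicode(detected))
print(t, t.dtype)
```

The input is read from a fresh BytesIO each time, which stands in for
calling s.seek(0) between the two rounds.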
Antony Lee


2012/5/1 Charles R Harris <charlesr.har...@gmail.com>

>
>
> On Fri, Apr 27, 2012 at 8:17 PM, Antony Lee <antony....@berkeley.edu> wrote:
>
>> With bytes fields, genfromtxt(dtype=None) sets the sizes of the fields to
>> the largest number of chars (npyio.py line 1596), but it doesn't do the
>> same for unicode fields, which is a pity.  See example below.
>> I tried to change npyio.py around line 1600 to add that but it didn't
>> work; from my limited understanding the problem comes earlier, in the way
>> StringBuilder is defined(?).
>> Antony Lee
>>
>> import io, numpy as np
>> s = io.BytesIO()
>> s.write(b"abc 1\ndef 2")
>> s.seek(0)
>> t = np.genfromtxt(s, dtype=None) # (or converters={0: bytes})
>> print(t, t.dtype)
>> # -> [(b'a', 1) (b'b', 2)] [('f0', '|S1'), ('f1', '<i8')]
>> s.seek(0)
>> t = np.genfromtxt(s, dtype=None,
>>                   converters={0: lambda s: s.decode("utf-8")})
>> print(t, t.dtype) # -> [('', 1) ('', 2)] [('f0', '<U0'), ('f1', '<i8')]
>>
>>
> Could you open a ticket for this?
>
> Chuck
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>

