Re: [Numpy-discussion] Status of NumPy and Python 3.3

2012-08-03 Thread Ondřej Čertík
On Mon, Jul 30, 2012 at 5:00 PM, Ronan Lamy ronan.l...@gmail.com wrote:
 On Monday, 30 July 2012 at 11:07 -0700, Ondřej Čertík wrote:
 On Mon, Jul 30, 2012 at 10:04 AM, Ronan Lamy ronan.l...@gmail.com wrote:
  On Monday, 30 July 2012 at 17:10 +0100, Ronan Lamy wrote:
  On Monday, 30 July 2012 at 04:57 +0100, Ronan Lamy wrote:
   On Monday, 30 July 2012 at 02:00 +0100, Ronan Lamy wrote:
  
   
Anyway, I managed to compile (by blanking
numpy/distutils/command/__init__.py) and to run the tests. I only see
the 2 pickle errors from your latest gist. So that's all good!
  
   And the cause of these errors is that running the test suite somehow
   corrupts Python's internal cache of bytes objects, causing the
   following:
   >>> b'\x01XXX'[0:1]
   b'\xbb'
 
  The culprit is test_pickle_string_overwrite() in test_regression.py. The
  test actually tries to check for that kind of problem, but on Python 3,
  it only manages to trigger it without detecting it. Here's a simple way
  to reproduce the issue:
 
   >>> a = numpy.array([1], 'b')
   >>> b = pickle.loads(pickle.dumps(a))
   >>> b[0] = 77
   >>> b'\x01  '[0:1]
   b'M'
 
  Actually, this problem is probably quite old: I can see it in 1.6.1 w/
  Python 3.2.3. 3.3 only makes it more visible.
 
  I'll open an issue on GitHub ASAP.
 
  https://github.com/numpy/numpy/issues/370
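
The reproduction above can be wrapped into a small regression check. This is only a sketch: the helper name is made up for illustration, it is not the test that was added to NumPy, and it assumes a NumPy build in which the bug is already fixed (so it should print True).

```python
import pickle

import numpy as np


def bytes_cache_survives_unpickled_write():
    """Return True if writing to an unpickled array leaves bytes literals intact."""
    a = np.array([1], 'b')               # a single int8 element
    b = pickle.loads(pickle.dumps(a))    # round-trip through pickle
    b[0] = 77                            # 77 == ord('M'); must only touch b's own buffer
    # With the bug, b shared memory with a cached one-byte bytes object,
    # so the slice below came back as b'M'.
    return b'\x01  '[0:1] == b'\x01'


print(bytes_cache_survives_unpickled_write())
```

On an affected build the function would return False, and the interpreter's bytes cache would stay corrupted for the rest of the session.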

 Thanks Ronan, nice work!

 Since you looked into this -- do you know a way to fix this? (Both
 NumPy and the test.)

 Pauli found out how to fix the code, so I'll try to send a PR tonight.


So this PR is now in and the issue is fixed.

As for the unicode swapping issues, I finally understand what is
going on, and I posted my current understanding to the Python tracker
issue (http://bugs.python.org/issue15540) that was recently created
for this same problem:

http://bugs.python.org/msg167280

It was determined that this is not a bug in Python, so the issue is now
closed. Finally, I have submitted a reworked version of my patch here:

https://github.com/numpy/numpy/pull/372

It implements things in a clean way.

Ondrej
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] 2 greatest values, in a 3-d array, along one axis

2012-08-03 Thread Jim Vickroy
Hello everyone,

I'm trying to determine the 2 greatest values, in a 3-d array, along one 
axis.

Here is an approach:

# --
# procedure to determine greatest 2 values for 3rd dimension of 3-d array ...
import numpy, numpy.ma
xcnt, ycnt, zcnt = 2, 3, 4                                # actual case is (1024, 1024, 8)
p0         = numpy.empty ((xcnt,ycnt,zcnt))
for z in range (zcnt) : p0[:,:,z] = z*z
zaxis      = 2                                            # max values to be determined for 3rd axis
p0max      = numpy.max (p0, axis=zaxis)                   # max values for zaxis
maxindices = numpy.argmax (p0, axis=zaxis)                # indices of max values
p1         = p0.copy()                                    # work array to scan for 2nd highest values
j, i       = numpy.meshgrid (numpy.arange (ycnt), numpy.arange (xcnt))
p1[i,j,maxindices] = numpy.nan                            # flag all max values
p1         = numpy.ma.masked_where (numpy.isnan (p1), p1) # hide all max values
p1max      = numpy.max (p1, axis=zaxis)                   # 2nd highest values for zaxis
# additional code to analyze p0max and p1max goes here
# --

I would appreciate feedback on a simpler approach -- e.g., one that does 
not require masked arrays and/or the use of magic values like NaN.
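
One possible answer that needs neither masked arrays nor NaN is a partial sort. The sketch below is not from the thread; it assumes a NumPy version that provides np.partition, and the function name two_largest is made up for illustration.

```python
import numpy as np


def two_largest(arr, axis=-1):
    """Return (largest, second_largest) along `axis`; needs at least 2 entries there."""
    # After partitioning at kth=-2, the last two slots along `axis` hold the
    # second-largest and the largest value, in that order.
    part = np.partition(arr, -2, axis=axis)
    return np.take(part, -1, axis=axis), np.take(part, -2, axis=axis)


# Same toy data as in the message above: p0[:, :, z] == z*z
p0 = np.empty((2, 3, 4))
for z in range(4):
    p0[:, :, z] = z * z

p0max, p1max = two_largest(p0, axis=2)
print(p0max)   # every entry is 9.0
print(p1max)   # every entry is 4.0
```

As with the argmax-based procedure, a maximum that occurs twice along the axis is reported as both the largest and the second-largest value.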

Thanks,
-- jv


Re: [Numpy-discussion] 2 greatest values, in a 3-d array, along one axis

2012-08-03 Thread Daπid
Here is a simple 1D implementation. It shouldn't be difficult to
generalize to more dimensions, as all the functions support an axis
argument:

>>> a = np.array([1, 2, 3, 5, 2])
>>> a.max()  # This is the maximum value
5
>>> mask = np.zeros_like(a)
>>> mask[np.argmax(a)] = 1
>>> a = np.ma.masked_array(a, mask=mask)
>>> a.max()  # Second maximum value
3

I am using a masked array, so the structure of the array is preserved (i.e.,
you can still use it in multi-dimensional arrays). I could have
deleted the value, but then that wouldn't be useful for your case.
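
The same masking idea might be generalized to an arbitrary axis roughly as follows. This is a sketch rather than the poster's code: it assumes a NumPy version that has np.put_along_axis, and second_max_masked is a hypothetical name.

```python
import numpy as np


def second_max_masked(arr, axis=-1):
    """Mask the first maximum along `axis`, then take the max of what is left."""
    # Keep `axis` as a length-1 dimension so the indices line up with arr.
    idx = np.expand_dims(np.argmax(arr, axis=axis), axis)
    mask = np.zeros(arr.shape, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=axis)   # flag one maximum per slice
    return np.ma.masked_array(arr, mask=mask).max(axis=axis)


a = np.array([[1, 2, 3, 5, 2],
              [7, 0, 7, 1, 4]])
print(second_max_masked(a, axis=1))
```

Only the first occurrence of the maximum is masked, so the second row (which contains 7 twice) yields 7 again, while the first row yields 3.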



Re: [Numpy-discussion] 2 greatest values, in a 3-d array, along one axis

2012-08-03 Thread Daπid
Here is the 3D implementation:

http://pastebin.com/ERVLWhbS

I was only able to do it using a double nested loop, but I am sure
someone more clever than me can do a slicing trick to overcome it.
Otherwise, I hope this is fast enough for your purpose.


David.



Re: [Numpy-discussion] 2 greatest values, in a 3-d array, along one axis

2012-08-03 Thread Angus McMorland

Here's a way that only uses argsort and fancy indexing:

import numpy as np
a = np.random.randint(10, size=(3,3,3))
print(a)

[[[0 3 8]
  [4 2 8]
  [8 6 3]]

 [[0 6 7]
  [0 3 9]
  [0 9 1]]

 [[7 9 7]
  [5 2 9]
  [9 3 3]]]

am = a.argsort(axis=2)
maxs = a[np.arange(a.shape[0])[:,None], np.arange(a.shape[1])[None], am[:,:,-1]]
print(maxs)

[[8 8 8]
 [7 9 9]
 [9 9 9]]

seconds = a[np.arange(a.shape[0])[:,None], np.arange(a.shape[1])[None], am[:,:,-2]]
print(seconds)

[[3 4 6]
 [6 3 1]
 [7 5 3]]

And to double check:

i, j = 0, 1
l = a[i, j,:]
print(l)

[4 2 8]

print(np.max(a[i,j,:]), maxs[i,j])

8 8

print(l[np.argsort(l)][-2], seconds[i,j])

4 4
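
For reference, the explicit index arrays can probably be avoided on newer NumPy releases. The sketch below assumes np.take_along_axis is available (it did not exist when this thread was written) and reuses the sample array printed above; the function name is made up for illustration.

```python
import numpy as np


def two_largest_by_argsort(arr, axis=-1):
    """Sort indices along `axis`, then pick the last two ranked values."""
    order = np.argsort(arr, axis=axis)
    ranked = np.take_along_axis(arr, order, axis=axis)   # arr sorted along `axis`
    return np.take(ranked, -1, axis=axis), np.take(ranked, -2, axis=axis)


a = np.array([[[0, 3, 8], [4, 2, 8], [8, 6, 3]],
              [[0, 6, 7], [0, 3, 9], [0, 9, 1]],
              [[7, 9, 7], [5, 2, 9], [9, 3, 3]]])
maxs, seconds = two_largest_by_argsort(a, axis=2)
print(maxs)
print(seconds)
```

The two printed arrays should match the maxs and seconds shown above.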

Good luck.

Angus.
-- 
AJC McMorland
Post-doctoral research fellow
Neurobiology, University of Pittsburgh


Re: [Numpy-discussion] 2 greatest values, in a 3-d array, along one axis

2012-08-03 Thread Jim Vickroy
Thanks for each of the improved solutions.  The one using argsort took a 
little while for me to understand.  I have a long way to go to fully 
utilize fancy indexing!  -- jv





Re: [Numpy-discussion] Licensing question

2012-08-03 Thread Dag Sverre Seljebotn
On 08/02/2012 10:44 PM, Damon McDougall wrote:
 Hi,

 I have a question about the licence for NumPy's codebase. I am currently
 writing a library and I'd like to release under some BSD-type licence.
 Unfortunately, my choice to link against MIT's FFTW library (released
 under the GPL) means that, in its current state, this is not possible.
 I'm an avid NumPy user and thought to myself that, since NumPy's licence
 is BSD, I'd be able to use some of the source code (with due credit, of
 course) instead of FFTW. Is this possible? I mean, can I redistribute
 *PART* of NumPy's codebase? Namely, the fftpack.c file? I was under the
 impression that I could only redistribute BSD source code as a whole and
 then I read the licence more carefully and it states that I can modify
 the source to suit my needs. I consider 'redistributing a single file
 and ignoring the other files' as a 'modification' under the BSD
 definition, but maybe I'm thinking too wishfully here.

 Any information on this matter would be greatly appreciated since I am a
 total code licence noob.

 Thank you.

 P.S. Yes, I know I could just release under the GPL, but I don't want to
 turn people off of packaging my work into a useful product licensed
 under BSD, or even make money from it.

Not related to licensing, but here's another port of FFTPACK to C by 
Martin Reinecke, licensed under BSD. The README has the links to the 
original Fortran sources that this is based on.

https://github.com/dagss/libfftpack

Dag


Re: [Numpy-discussion] 2 greatest values, in a 3-d array, along one axis

2012-08-03 Thread eat
Hi,


Here the np.indices function may help a little bit, like:
In []: from numpy import indices
In []: from numpy.random import randint
In []: a= randint(10, size= (3, 2, 4))
In []: a
Out[]:
array([[[1, 9, 6, 6],
        [0, 3, 4, 2]],
       [[4, 2, 4, 4],
        [5, 9, 4, 4]],
       [[6, 1, 4, 3],
        [5, 4, 5, 5]]])

In []: ndx= indices(a.shape)
In []: # largest
In []: a[a.argsort(0), ndx[1], ndx[2]][-1]
Out[]:
array([[6, 9, 6, 6],
       [5, 9, 5, 5]])

In []: # second largest
In []: a[a.argsort(0), ndx[1], ndx[2]][-2]
Out[]:
array([[4, 2, 4, 4],
       [5, 4, 4, 4]])
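
To see why this indexing works, it may help to note that the fancy-indexed result is simply the array sorted along axis 0. A small self-contained sketch using the sample values from the session above (the variable names ranked, largest and second_largest are introduced here for illustration):

```python
import numpy as np

a = np.array([[[1, 9, 6, 6], [0, 3, 4, 2]],
              [[4, 2, 4, 4], [5, 9, 4, 4]],
              [[6, 1, 4, 3], [5, 4, 5, 5]]])

ndx = np.indices(a.shape)                  # ndx[k] holds the index along axis k
ranked = a[a.argsort(0), ndx[1], ndx[2]]   # same values as np.sort(a, axis=0)
largest, second_largest = ranked[-1], ranked[-2]

print(np.array_equal(ranked, np.sort(a, axis=0)))
print(largest)
print(second_largest)
```

Here ndx[1] and ndx[2] keep every element in its own (row, column) position, and only the index along axis 0 is replaced by the sorting order.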


My 2 cents,
-eat




[Numpy-discussion] Unicode revisited

2012-08-03 Thread Travis Oliphant
Hey all, 

Ondrej has been working hard, with feedback from many others, on improving
Unicode support in NumPy (especially for Python 3.3). Looking at what Python
has done in Python 3.3 (PEP 393) and chatting on the Python issue tracker with
the author of that PEP has made me wonder if we aren't doing the wrong thing
in NumPy quite often.

Basically, NumPy only supports UTF-32 in its Unicode representation. All
bytes in NumPy arrays should be either UTF-32LE or UTF-32BE. This is all
pretty easy to understand as long as you stick with NumPy arrays only.

The difficulty starts when you interact with the unicode array scalar
(which is exactly the same data structure as a Python unicode object, with a
different type name: numpy.unicode_). However, I overlooked the
encoding argument to the standard unicode constructor, which might have
simplified what we are doing. If I understand things correctly now, all we
need to do is decode the UTF-32LE or UTF-32BE raw bytes in the array
(depending on the dtype) into a unicode object.

This is easily accomplished with numpy.unicode_(bytes object, 'utf_32_be'
or 'utf_32_le'). There is also an encoding equivalent to go from the
Python unicode object to the bytes representation in the NumPy array. I think
this is what we should be doing in most places, and it should
considerably simplify the Unicode code in NumPy, possibly eliminating the
ucsnarrow.c file.
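
At the Python level, the round trip described here can be illustrated roughly as follows. This is a sketch under assumptions: it uses the Python 3 built-in str/bytes codecs in place of the numpy.unicode_ constructor, and it says nothing about how the C code in NumPy would actually be restructured.

```python
import numpy as np

text = 'h\u00e9llo'    # five characters, one of them non-ASCII

for dtype, codec in (('<U5', 'utf_32_le'), ('>U5', 'utf_32_be')):
    arr = np.array([text], dtype=dtype)
    raw = arr.tobytes()                # array storage: 4 bytes per character
    decoded = raw.decode(codec)        # raw bytes in the array -> Python str
    encoded = decoded.encode(codec)    # Python str -> bytes for the array
    print(dtype, len(raw), decoded == text, encoded == raw)
```

The string fills the five-character field exactly; shorter strings are padded with NUL characters in the array, which would have to be stripped after decoding.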

Am I missing something? 

Thanks, 

-Travis


 


Re: [Numpy-discussion] Unicode revisited

2012-08-03 Thread Charles R Harris

I can't comment on the rest, but I'd be happy to see the end of the
ucsnarrow.c file. It needs more work to be properly generalized and if
there is a way to avoid that, so much the better.

Chuck


Re: [Numpy-discussion] Unicode revisited

2012-08-03 Thread Ondřej Čertík

I guess we'll try and see. :)

Would it make sense to merge https://github.com/numpy/numpy/pull/372
now, since it makes NumPy work on Python 3.3 (and the implementation
seems reasonable to me)? Then I'll work on trying your new approach
for 2.7, 3.2, and 3.3.

Ondrej