On Tue, Aug 29, 2006 at 03:46:45PM -0700, Mathew Yeates wrote:
> My head is about to explode.
>
> I have an M by N array of floats. Associated with the columns are
> character labels
> ['a','b','b','c','d','e','e','e'] note: already sorted so duplicates
> are contiguous
>
> I want to replace the 2 'b' columns with the sum of the 2 columns.
> Similarly, replace the 3 'e' columns with the sum of the 3 'e' columns.
>
> The resulting array still has M rows but fewer than N columns. Anyone?
> Couldn't be any harder than Sudoku.

I attach one possible solution. It sums each run of adjacent columns
that share a label, so the same label may occur in more than one place
(e.g. ['a','b','b','a'] gives three output columns: 'a', 'b', 'a').
I'd be glad for any suggestions on how to clean up the code.

Regards
Stéfan

import numpy as N
import itertools

x = N.ones((4,6))
fields = ['a','a','b','b','b','a']

# Collapse each run of identical adjacent labels into one label
# and record how many columns the run spans.
fields_ = []
field_lengths = []
for k,g in itertools.groupby(fields):
    fields_.append(k)
    field_lengths.append(len(list(g)))

# Column boundaries of the runs: [0, 2, 5, 6] for the labels above.
indices = N.cumsum([0] + list(field_lengths))

oshape = list(x.shape)
oshape[1] = len(indices)-1
y = N.empty(oshape,dtype=x.dtype)

# Sum the columns of each run into a single output column.
for i,s in enumerate(map(slice,indices[:-1],indices[1:])):
    y[:,i] = x[:,s].sum(axis=1)

print('Input:')
print('------')
print(fields)
print(x)
print('')
print('Output:')
print('-------')
print(fields_)
print(y)
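
One possible cleanup, offered only as a sketch: the explicit loop over
slices can probably be replaced by numpy's add.reduceat, which sums
between consecutive start indices along an axis. The function name
sum_adjacent_columns is made up for this example, and the sketch
assumes there is at least one column.

```python
import itertools
import numpy as np

def sum_adjacent_columns(x, labels):
    """Sum each run of adjacent, identically labelled columns of x.

    Hypothetical helper for illustration; assumes len(labels) == x.shape[1] >= 1.
    """
    keys, lengths = [], []
    for key, group in itertools.groupby(labels):
        keys.append(key)
        lengths.append(sum(1 for _ in group))
    # Start column of each run: cumulative run lengths, shifted by one.
    starts = np.cumsum([0] + lengths[:-1])
    # reduceat sums x[:, starts[i]:starts[i+1]]; the last run extends to the end.
    return keys, np.add.reduceat(x, starts, axis=1)

x = np.arange(24, dtype=float).reshape(4, 6)
labels = ['a', 'a', 'b', 'b', 'b', 'a']
keys, y = sum_adjacent_columns(x, labels)
print(keys)
print(y)
```

As in the attached code, non-adjacent repeats of a label stay separate,
so the trailing 'a' column is not merged with the first two.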
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/numpy-discussion