On Tue, Aug 29, 2006 at 03:46:45PM -0700, Mathew Yeates wrote:
> My head is about to explode.
> 
> I have an M by N array of floats. Associated with the columns are 
> character labels
> ['a','b','b','c','d','e','e','e']  note: already sorted so duplicates 
> are contiguous
> 
> I want to replace the 2 'b' columns with the sum of the 2 columns. 
> Similarly, replace the 3 'e' columns with the sum of the 3 'e' columns.
> 
> The resulting array still has M rows but fewer than N columns. Anyone? 
> Couldn't be any harder than Sudoku.

I attach one possible solution.  It sums each run of adjacent columns
that share a label, so the same name may occur in more than one place
(e.g. ['a','b','b','a'] gives three output columns).  I'd be glad
for any suggestions on how to clean up the code.

Regards
Stéfan
import numpy as N
import itertools

x = N.ones((4,6))
fields = ['a','a','b','b','b','a']

# Collapse each run of identical labels into one label and its length
fields_ = []
field_lengths = []
for k,g in itertools.groupby(fields):
    fields_.append(k)
    field_lengths.append(len(list(g)))

# Column boundaries: run i spans indices[i] up to (not including) indices[i+1]
indices = N.cumsum([0] + field_lengths)

oshape = list(x.shape)
oshape[1] = len(indices)-1
y = N.empty(oshape,dtype=x.dtype)

# Sum the columns of each run into one output column
for i,(start,stop) in enumerate(zip(indices[:-1],indices[1:])):
    y[:,i] = x[:,start:stop].sum(axis=1)

print('Input:')
print('------')
print(fields)
print(x)
print()
print('Output:')
print('-------')
print(fields_)
print(y)
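
On the clean-up question, one possible simplification (a sketch only, not
part of the attachment above) is to let numpy.add.reduceat do the per-run
sums, which removes the explicit loop and the preallocated output array.
The function name sum_runs and the test data are made up for illustration,
and the sketch assumes the array has at least one column.

```python
import itertools

import numpy as np


def sum_runs(x, labels):
    """Sum adjacent columns of x that share a label (hypothetical helper)."""
    # Collapse each run of identical labels into (label, run_length).
    runs = [(key, len(list(group))) for key, group in itertools.groupby(labels)]
    run_labels = [key for key, _ in runs]
    lengths = [n for _, n in runs]
    # Start column of each run: 0, then the running total of earlier lengths.
    starts = np.cumsum([0] + lengths[:-1])
    # reduceat sums the columns from each start up to the next start;
    # the last run extends to the final column.
    return run_labels, np.add.reduceat(x, starts, axis=1)


x = np.arange(24, dtype=float).reshape(4, 6)
fields = ['a', 'a', 'b', 'b', 'b', 'a']
out_fields, y = sum_runs(x, fields)
print(out_fields)
print(y)
```

As in the attachment, repeated labels that are not adjacent stay in
separate output columns.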
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/numpy-discussion
