I was trying out SeqAccessSparseVector with Canopy Clustering using Manhattan distance, and I found performance to be really bad, so I profiled it with YourKit (thanks a lot for providing us a free license).
Since I was using Manhattan distance, there were a lot of A-B operations, which created a lot of clone operations (5% of the total time). There were also many A+B operations, from adding each point to the canopy to compute the average, and these created a lot of clone operations as well (90% of the total time). So we definitely need to improve that.

As a small hack, I made the cluster centers RandomAccess vectors, and things are fast again. I don't know whether to commit this or not, but is it something to look into for 0.4?

Robin
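
To illustrate the effect described above, here is a minimal toy sketch in Python. It is not Mahout code: the class SeqSparse and both helper functions are hypothetical names made up for this example, and it assumes that A+B on a sequential-access vector builds a fresh copy on every call, while a hash-backed (random-access) running sum can be updated in place. Both approaches give the same totals; only the amount of copying differs.

```python
class SeqSparse:
    """Toy sequential-access sparse vector: sorted indices plus parallel values."""

    def __init__(self, entries=()):
        items = sorted(dict(entries).items())
        self.idx = [i for i, _ in items]
        self.val = [v for _, v in items]

    def plus(self, other):
        # Merge two sorted index lists into a brand-new vector.
        # Every call copies the entire left-hand side (the "clone" cost).
        out = SeqSparse()
        a, b = 0, 0
        while a < len(self.idx) or b < len(other.idx):
            if b == len(other.idx) or (a < len(self.idx) and self.idx[a] < other.idx[b]):
                out.idx.append(self.idx[a])
                out.val.append(self.val[a])
                a += 1
            elif a == len(self.idx) or other.idx[b] < self.idx[a]:
                out.idx.append(other.idx[b])
                out.val.append(other.val[b])
                b += 1
            else:
                # Same index on both sides: add the values.
                out.idx.append(self.idx[a])
                out.val.append(self.val[a] + other.val[b])
                a += 1
                b += 1
        return out

    def to_dict(self):
        return dict(zip(self.idx, self.val))


def sum_with_sequential_center(points):
    # The running sum is rebuilt from scratch for every point added.
    total = SeqSparse()
    for p in points:
        total = total.plus(p)
    return total.to_dict()


def sum_with_random_access_center(points):
    # The running sum lives in a hash map and is updated in place;
    # only the non-zero entries of each incoming point are touched.
    total = {}
    for p in points:
        for i, v in zip(p.idx, p.val):
            total[i] = total.get(i, 0.0) + v
    return total


points = [
    SeqSparse({0: 1.0, 5: 2.0}),
    SeqSparse({5: 3.0, 9: 4.0}),
    SeqSparse({2: 1.5}),
]
seq_total = sum_with_sequential_center(points)
rand_total = sum_with_random_access_center(points)
```

In this toy model, adding n points to a sequential center copies the whole running sum n times, while the hash-backed center only does work proportional to the non-zero entries of each new point.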