I looked into DenseVector and it doesn't use any primitive collections, so ignore my last mail :)
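
For reference, the "direct dot product computation" mentioned in the quoted mail below amounts to a plain loop over primitive arrays. A minimal sketch, assuming both vectors are available as double[] of equal length; the class and method names are hypothetical and this is not the actual code from RecommenderJob:

```java
// Hypothetical helper, not part of Mahout: a dot product computed directly
// on primitive arrays, without going through the Vector abstraction.
final class DirectDot {

  private DirectDot() {}

  static double dot(double[] a, double[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException(
          "Vector lengths differ: " + a.length + " vs " + b.length);
    }
    double sum = 0.0;
    // Single pass over both arrays; no per-element method calls or boxing.
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  public static void main(String[] args) {
    double[] userFeatures = {1.0, 2.0, 3.0};
    double[] itemFeatures = {4.0, 5.0, 6.0};
    // 1*4 + 2*5 + 3*6 = 32.0
    System.out.println(dot(userFeatures, itemFeatures));
  }
}
```

Whether a loop like this is actually faster than dot() in a given job would still need to be measured on the real workload.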

On 12.03.2013 22:16, Sebastian Schelter wrote:
> As a sidenote: I was kinda shocked recently, that switching from
> DenseVector's dot() method to a direct dot product computation gave a 3x
> increase in performance in
> org.apache.mahout.cf.taste.hadoop.als.RecommenderJob.
>
> It seems like we really have a performance problem for some usecases.
>
> On 12.03.2013 22:04, Dawid Weiss wrote:
>>> The primary use case for mahout collections is directly *inside* of
>>> our Vector interface. Which is to say, it's not directly exposed to
>>> most users, and we don't really expose the ability to do guava collections
>>> stuff on them at all: We Do Math. :) So in particular, we don't expose
>>
>> Fair enough. But you might want to expose some of it at some point and
>> if this happens it may just be ready for you.
>>
>>> Question is whether there's anything to be gained by just swapping
>>> our own collections *out* for something else, like HPPC or fastutil.
>>
>> Depends. Speed optimizations may be one reason -- you'd need to check
>> if the code gains anything by using these libraries compared to Mahout
>> collections. While microbenchmarks may show large differences my bet
>> is that overall results, taking into account computations and, God
>> forbid, I/O, will be within noise range unless you're really using
>> these data structures a *lot* in tight loops. The only practical
>> benefit I see is getting rid of a chunk of code you don't wish to
>> maintain (like you said: missing features, unit tests, etc.). But I
>> don't negate there is some entertainment value in going back to such
>> fundamental data structures and trying to squeeze the last bit of
>> performance out of them. :)
>>
>> Dawid
>>
>
