On Mon, Jun 15, 2009 at 4:17 PM, Jeff Eastman <j...@windwardsolutions.com> wrote:
> The difference is getQuick does not check bounds.

This does make sense for practical reasons. From an API perspective, it seems slightly undesirable to have two of these methods, one of which exists just to bypass the checks, as it might lead to silent bugs. I see the purpose: performance. But in that case, as I am getting to in MAHOUT-121, perhaps it is better to attack that problem in a different way? For instance, if most of the setting is done during vector construction, and I imagine it is, then we might instead want some kind of bulk-set method.

> IIRC, clone of a SparseVector would share the map. Not desirable IMHO.

The default implementation of clone() produces a shallow copy, indeed, but that is why it may be overridden. It seems slightly better to just call this method clone() and implement Cloneable to fit into the standard APIs.

> Cardinality is the actual dimension and size is the current number of
> elements. It only differs from cardinality in the case of SparseVector, in
> which case size returns the number of non-zero elements.

Got it. Is size() a useful value to expose? It seems like an implementation detail. What would a caller, in general, do with that? You could say it can't hurt to expose, but for instance, the first use of it that came up in my IDE appears to be a bug (could be wrong) in UncommonsDistributions:

  public static Vector rDirichlet(Vector alpha) {
    Vector r = alpha.like();
    double total = alpha.zSum();
    double remainder = 1;
    for (int i = 0; i < r.size(); i++) {
      double a = alpha.get(i);
      total -= a;
      double beta = rBeta(a, Math.max(0, total));
      double p = beta * remainder;
      r.set(i, p);
      remainder -= p;
    }
    return r;
  }

If r is sparse, r.size() would be the number of non-zero elements rather than the dimension, so the loop might not visit every index. Is cardinality intended?
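
To make the size() versus cardinality() concern concrete, here is a minimal sketch. It does not use Mahout's classes: ToySparseVector and SizeVsCardinalityDemo are hypothetical stand-ins, and the sketch assumes that like() returns an empty vector of the same cardinality and that size() counts only stored non-zero elements, as described in the quoted text above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a sparse vector; NOT Mahout's SparseVector.
class ToySparseVector {
    private final int cardinality;                                // declared dimension
    private final Map<Integer, Double> values = new HashMap<>();  // non-zero entries only

    ToySparseVector(int cardinality) {
        this.cardinality = cardinality;
    }

    int cardinality() {
        return cardinality;
    }

    // Number of stored (non-zero) elements, per the description in the thread.
    int size() {
        return values.size();
    }

    double get(int index) {
        checkIndex(index);
        return values.getOrDefault(index, 0.0);
    }

    void set(int index, double value) {
        checkIndex(index);
        if (value == 0.0) {
            values.remove(index);
        } else {
            values.put(index, value);
        }
    }

    // Assumption: like() yields an empty vector with the same cardinality.
    ToySparseVector like() {
        return new ToySparseVector(cardinality);
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= cardinality) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }
}

class SizeVsCardinalityDemo {
    // Counts the indices visited by a loop bounded by r.size().
    static int visitedWithSize(ToySparseVector alpha) {
        ToySparseVector r = alpha.like();
        int visited = 0;
        for (int i = 0; i < r.size(); i++) {
            visited++;
        }
        return visited;
    }

    // Counts the indices visited by a loop bounded by r.cardinality().
    static int visitedWithCardinality(ToySparseVector alpha) {
        ToySparseVector r = alpha.like();
        int visited = 0;
        for (int i = 0; i < r.cardinality(); i++) {
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        ToySparseVector alpha = new ToySparseVector(5);
        alpha.set(1, 0.5);
        alpha.set(3, 2.0);
        System.out.println("size-bounded loop visits: " + visitedWithSize(alpha));
        System.out.println("cardinality-bounded loop visits: " + visitedWithCardinality(alpha));
    }
}
```

Under these assumptions the size-bounded loop visits no indices at all, because the freshly created r stores nothing, while the cardinality-bounded loop visits all five. Whether the real implementation behaves this way depends on what like() and size() actually do there.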