The Mahout Vector implementations of arithmetic have what I think is a bug.
class AbstractVector

    addTo(Vector v) {
      Iterator<Element> it = iterateNonZero();
      while (it.hasNext()) {
        Element e = it.next();
        int index = e.index();
        v.setQuick(index, v.getQuick(index) + e.get());
      }
    }

Because "this" walks only its own non-zero elements, the entries of the
other vector at positions where "this" holds a zero are ignored. That is:

    [1, 2, 0, 4].addTo([1, 1, 1, 1]) = [2, 3, 0, 5]

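One way to check the example above without a working Mahout build is to model the quoted loop on plain arrays. This is only a sketch, not Mahout code: the class name AddToCheck and the array-based addTo are hypothetical, and skipping the zero entries of a double[] is an assumed stand-in for iterateNonZero().

```java
import java.util.Arrays;

final class AddToCheck {
    // Assumed stand-in for AbstractVector.addTo: visit only the non-zero
    // entries of `self` (mimicking iterateNonZero()) and add them into `v`.
    static void addTo(double[] self, double[] v) {
        for (int index = 0; index < self.length; index++) {
            if (self[index] != 0.0) {
                v[index] = v[index] + self[index];
            }
        }
    }

    public static void main(String[] args) {
        double[] self = {1, 2, 0, 4};
        double[] v = {1, 1, 1, 1};
        addTo(self, v);
        // Print the destination so the entry at index 2 can be inspected.
        System.out.println(Arrays.toString(v));
    }
}
```

In this array model the destination keeps its original value of 1 at index 2, because the loop never touches that position. If the Mahout classes really produce a 0 there, the cause would have to be something the model leaves out, for example how a particular subclass implements iterateNonZero() or setQuick().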
public Vector plus(Vector) also does this at around line 371:

    Vector result = like().assign(this);
    Iterator<Element> iter = x.iterateNonZero();
    while (iter.hasNext()) {
      Element e = iter.next();
      int index = e.index();
      result.setQuick(index, this.getQuick(index) + e.get());
    }
    return result;

None of the Vector subclasses that store data (Dense, RandomAccess,
SequentialAccess) override these two methods.
The unit tests don't catch this mistake; they need a wider range of test
data. A lot of code uses these two methods and is getting bogus
results. Maven is not cooperating on my machine, so I can't run the test
suites on a fixed version.
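As a rough idea of what "a wider range of test data" could look like, the sketch below pairs up operands with different zero patterns and compares a non-zero-iteration sum against a plain element-wise sum. Everything here is an assumption made for illustration: PlusZeroPatternCheck, plusViaNonZero, plusDense and countMismatches are hypothetical names, and the arrays only model the quoted plus() loop rather than calling Mahout.

```java
import java.util.Arrays;

final class PlusZeroPatternCheck {
    // Assumed stand-in for the quoted plus(): copy `self`, then visit only
    // the non-zero entries of `x` and store self[index] + x[index].
    static double[] plusViaNonZero(double[] self, double[] x) {
        double[] result = Arrays.copyOf(self, self.length);
        for (int index = 0; index < x.length; index++) {
            if (x[index] != 0.0) {
                result[index] = self[index] + x[index];
            }
        }
        return result;
    }

    // Reference result: element-wise sum over every index.
    static double[] plusDense(double[] a, double[] b) {
        double[] sum = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            sum[i] = a[i] + b[i];
        }
        return sum;
    }

    // Counts the ordered operand pairs on which the two sums disagree.
    static int countMismatches(double[][] cases) {
        int mismatches = 0;
        for (double[] a : cases) {
            for (double[] b : cases) {
                if (!Arrays.equals(plusViaNonZero(a, b), plusDense(a, b))) {
                    mismatches++;
                }
            }
        }
        return mismatches;
    }

    static final double[][] CASES = {
        {1, 2, 0, 4},   // zero in the middle
        {1, 1, 1, 1},   // no zeros
        {0, 0, 0, 0},   // all zeros
        {0, 5, 0, 0},   // mostly zeros
        {3, 0, 0, -3},  // zeros plus a negative value
    };

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(CASES));
    }
}
```

On the array model every pair agrees, so the table is mainly useful as input for the real classes: running the same operand pairs through each storing Vector subclass, in both operand orders, would show whether any of them departs from the element-wise sum.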
Lance Norskog