The Mahout Vector implementations of arithmetic have what I think is a bug.

class AbstractVector:

    addTo(Vector v) {
      // Visits only the non-zero elements of "this".
      Iterator<Element> it = iterateNonZero();
      while (it.hasNext()) {
        Element e = it.next();
        int index = e.index();
        v.setQuick(index, v.getQuick(index) + e.get());
      }
    }

Because "this" iterates over only its own non-zero elements, the matching entries in the other vector are ignored wherever "this" is zero. That is:

[1, 2, 0, 4].addTo([1, 1, 1, 1]) = [2,  3, 0, 5]
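
One way to check this is to trace the same loop outside Mahout. The sketch below is a stand-in, not Mahout code: it assumes plain double[] arrays of equal length in place of Vector, and the names AddToTrace and addTo(source, target) are made up for illustration. It prints the target after the loop so the output can be compared with the element-wise sum [2, 3, 1, 5].

```java
import java.util.Arrays;

public class AddToTrace {
    // Stand-in for AbstractVector.addTo: visit only the non-zero entries
    // of source and add each one into the same index of target.
    // Assumes both arrays have the same length.
    static void addTo(double[] source, double[] target) {
        for (int index = 0; index < source.length; index++) {
            if (source[index] != 0.0) {
                target[index] = target[index] + source[index];
            }
        }
    }

    public static void main(String[] args) {
        double[] source = {1, 2, 0, 4};
        double[] target = {1, 1, 1, 1};
        addTo(source, target);
        // Compare this output against the element-wise sum.
        System.out.println(Arrays.toString(target));
    }
}
```

Swapping the two arrays in main exercises the case where the zero sits in the target rather than in the source.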

public Vector plus(Vector) also does this at around line 371:

    Vector result = like().assign(this);
    Iterator<Element> iter = x.iterateNonZero();
    while (iter.hasNext()) {
      Element e = iter.next();
      int index = e.index();
      result.setQuick(index, this.getQuick(index) + e.get());
    }
    return result;
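
A stand-alone trace of this loop can help when Mahout itself is hard to build. The sketch below is only an approximation under stated assumptions: plain double[] arrays of equal length stand in for Vector, Arrays.copyOf stands in for like().assign(this), and the names PlusTrace and plus(self, x) are hypothetical. It prints the result for both operand orders so each can be compared with the element-wise sum [2, 3, 1, 5].

```java
import java.util.Arrays;

public class PlusTrace {
    // Stand-in for AbstractVector.plus: start from a copy of self, then
    // for every non-zero entry of x write self[index] + x[index].
    // Assumes both arrays have the same length.
    static double[] plus(double[] self, double[] x) {
        double[] result = Arrays.copyOf(self, self.length);
        for (int index = 0; index < x.length; index++) {
            if (x[index] != 0.0) {
                result[index] = self[index] + x[index];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 0, 4};
        double[] b = {1, 1, 1, 1};
        // Try both orders: zero in the receiver, then zero in the argument.
        System.out.println(Arrays.toString(plus(a, b)));
        System.out.println(Arrays.toString(plus(b, a)));
    }
}
```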


None of the Vector subclasses that store data (Dense, RandomAccess, SequentialAccess) override these two methods. The unit tests don't catch this mistake; they need a wider range of test data. A lot of code uses these two methods and is getting bogus results. Because Maven is misbehaving for me, I can't run the test suites against a fixed version.

Lance Norskog
