This looks very wrong.  The iterators for SASV extend guava's
AbstractIterator, but they do reuse the NonDefaultElement instance
internally.   It *looks* like we're correctly satisfying the
AbstractIterator#computeNext() contract, but we must not be if we're
mutating on multiple hasNext() calls...



On Fri, Apr 12, 2013 at 9:36 AM, Dan Filimon <[email protected]>wrote:

> While looking at the patch for fixing the sparse vectors (MAHOUT-1190), I
> started working with vector Iterators doing what I thought was reasonable.
>
> This is the important snippet:
> [...]
>         thisIterator = this.iterateNonZero();
>         thatIterator = other.iterateNonZero();
>         thisElement = thatElement = null;
>         boolean advanceThis = true;
>         boolean advanceThat = true;
>         OrderedIntDoubleMapping thisUpdates = new
> OrderedIntDoubleMapping();
>
>         while (thisIterator.hasNext() && thatIterator.hasNext()) {
>           if (advanceThis) {
>             thisElement = thisIterator.next();
>           }
>           if (advanceThat) {
>             thatElement = thatIterator.next();
>           }
> [... advanceThis and advanceThat are set to true based on which iterator to
> advance...]
>
> The problem here is that when calling next(), the iterator state gets
> invalidated and when calling hasNext() the iterator will be advanced
> accordingly and the element references will point to the next element
> (which is mutated).
>
> So, if the indices start at:
> 52 and 87
> despite wanting to only advance the 52, since both were accessed with
> next(), they are both modified.
>
> Here's another snippet with this behavior [1]:
>
>     Vector vector = new SequentialAccessSparseVector(100);
>     vector.set(0, 1);
>     vector.set(2, 2);
>     vector.set(4, 3);
>     vector.set(6, 4);
>     Iterator<Vector.Element> vectorIterator = vector.iterateNonZero();
>     Vector.Element element = null;
>     int i = 0;
>     while (vectorIterator.hasNext()) {
>       if (i % 2 == 0) {
>         element = vectorIterator.next();
>       }
>       System.out.printf("%d %d %f\n", i, element.index(), element.get());
>       ++i;
>     }
>
>
> The output is:
> 0 0 1.000000
> 1 2 2.000000
> 2 2 2.000000
> 3 4 3.000000
> 4 4 3.000000
> 5 6 4.000000
> 6 6 4.000000
>
> I expected it to be:
> 0 0 1.000000
> 1 0 1.000000
> 2 2 2.000000
> 3 2 2.000000
> 4 4 3.000000
> 5 4 3.000000
> 6 6 4.000000
>
> So, I'm completely wrong. Is this just me not understanding what an
> iterator is supposed to do?
>
> [1] https://gist.github.com/dfilimon/5373271
>



-- 

  -jake

Reply via email to