I think that this should be as fast and perhaps more clear. I would
normally expect almost any clean loop of this form to be about equal in
speed, but Robin and/or Jake had some surprising speed numbers a while back
on similar code where a not so pretty loop was a bit faster. On the other
hand, we need simple and correct first.
Iterator<Element> myIter = iterateNonZero();
Iterator<Element> otherIter = x.iterateNonZero();
double result = 0.0;
if (myIter.hasNext() && otherIter.hasNext()) {
Element myCurrent = myIter.next();
Element otherCurrent = otherIter.next();
while (myIter.hasNext() && otherIter.hasNext()) {
int myIndex = myCurrent.index();
int otherIndex = otherCurrent.index();
if (myIndex == otherIndex) {
result += myCurrent.get() * otherCurrent.get();
}
if (myIndex <= otherIndex) {
myCurrent = myIter.next();
}
if (myIndex >= otherIndex) {
otherCurrent = otherIter.next();
}
}
}
return result;
On Fri, Jun 11, 2010 at 3:56 AM, Sean Owen <[email protected]> wrote:
> Yeah it's a bug.
>
> The section is intended to be an optimization. But it ought to go
> something like this. This is the best structure I could find for a
> similar problem in AbstractSimilarity.
>
> I'll commit this if there are no better fixes and tests do pass.
>
> Iterator<Element> myIter = iterateNonZero();
> Iterator<Element> otherIter = x.iterateNonZero();
> if (!myIter.hasNext() || !otherIter.hasNext()) {
> return 0.0;
> }
> Element myCurrent = myIter.next();
> Element otherCurrent = otherIter.next();
> double result = 0.0;
> while (true) {
> int myIndex = myCurrent.index();
> int otherIndex = otherCurrent.index();
> if (myIndex == otherIndex) {
> result += myCurrent.get() * otherCurrent.get();
> }
> if (myIndex <= otherIndex) {
> if (!myIter.hasNext()) {
> break;
> }
> myCurrent = myIter.next();
> }
> if (myIndex >= otherIndex) {
> if (!otherIter.hasNext()) {
> break;
> }
> otherCurrent = otherIter.next();
> }
> }
>
> On Fri, Jun 11, 2010 at 10:13 AM, Robin Anil <[email protected]> wrote:
> > This is possibly a bug. Let me check the code.
> >
> > On Fri, Jun 11, 2010 at 2:31 PM, zhao zhendong <[email protected]
> >wrote:
> >
> >> Hi,
> >>
> >> One strange thing, SequentialAccessSparseVector.dot() seems does NOT
> work
> >> correctly. When I change the Vector w = new
> >> RandomAccessSparseVector(Integer.MAX_VALUE, 12); Then the code works
> well.
> >>
> >> I check the source code of SequentialAccessSparseVector.dot(), it checks
> >> the
> >> index of two Vectors(if both of them are instance of
> >> SequentialAccessSparseVector), once they mis-match, it will return 0.0.
> >>
> >> *Why do we need such constrain?
> >> *
> >> In this case, what's the best way to get dot product between two
> >> mis-matched
> >> SequentialAccessSparseVector instances?
> >>
> >> Code:
> >> Vector w = new SequentialAccessSparseVector(Integer.MAX_VALUE, 12);
> >> w.set(1, 0.4);
> >> w.set(2, 0.4);
> >> w.set(3, -0.666666667);
> >>
> >> Vector v = new SequentialAccessSparseVector(Integer.MAX_VALUE, 12);
> >> v.set(3, 1);
> >> System.out.println(datastore.getFeatureRow(2).dot(w));
> >>
> >> The result is 0.0 (should be -0.666666667).
> >>
> >>
> >> --
> >> -------------------------------------------------------------
> >>
> >> Zhen-Dong Zhao (Maxim)
> >>
> >> <><<><><><><><><><>><><><><><>>>>>>
> >>
> >> Department of Computer Science
> >> School of Computing
> >> National University of Singapore
> >>
> >> >>>>>>><><><><><><><><<><>><><<<<<<
> >>
> >
>