Also the calculation for cost is wrong. RASV is HashMap based its cost O(n) not O(log n)
------ Robin Anil On Mon, Apr 22, 2013 at 4:43 PM, Robin Anil <[email protected]> wrote: > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/10669/ > > math/src/main/java/org/apache/mahout/math/AbstractVector.java<https://reviews.apache.org/r/10669/diff/2/?file=283185#file283185line56> > (Diff > revision 2) > > None > > 55 > > boolean commutativeAndAssociative = aggregator.isCommutative() && > aggregator.isAssociative(); > > This should be a method in AbstractFunction interface > > > > math/src/main/java/org/apache/mahout/math/AbstractVector.java<https://reviews.apache.org/r/10669/diff/2/?file=283185#file283185line1037> > (Diff > revision 2) > > {'text': 'public int getNumNonZeroElements() {', 'line': 726, > 'expand_offset': 2} > > {'text': 'public int getNumNonZeroElements() {', 'line': 927, > 'expand_offset': 2} > > 942 > > // Make all the non-zero values 0. > > vector.clear() > > > > math/src/main/java/org/apache/mahout/math/AbstractVector.java<https://reviews.apache.org/r/10669/diff/2/?file=283185#file283185line1047> > (Diff > revision 2) > > {'text': 'public int getNumNonZeroElements() {', 'line': 726, > 'expand_offset': 2} > > {'text': 'public int getNumNonZeroElements() {', 'line': 927, > 'expand_offset': 2} > > 952 > > while (it.hasNext()) { > > There is no guarantee of ordered iteration. The updates is unsorted. merge > typically implies that input is sorted > > > > math/src/main/java/org/apache/mahout/math/AbstractVector.java<https://reviews.apache.org/r/10669/diff/2/?file=283185#file283185line1074> > (Diff > revision 2) > > {'text': 'public int getNumNonZeroElements() {', 'line': 726, > 'expand_offset': 2} > > {'text': 'public int getNumNonZeroElements() {', 'line': 927, > 'expand_offset': 2} > > 979 > > element.set(values[index]); > > What if there are values that are not in the existing data. I dont this is > correct > > > > math/src/main/java/org/apache/mahout/math/AbstractVector.java<https://reviews.apache.org/r/10669/diff/2/?file=283185#file283185line1100> > (Diff > revision 2) > > {'text': 'public int getNumNonZeroElements() {', 'line': 726, > 'expand_offset': 2} > > {'text': 'public int getNumNonZeroElements() {', 'line': 927, > 'expand_offset': 2} > > 999 > > invalidateCachedLength(); > > move invalidateCachedLength to the set method. Instead of dispersing it > everywhere. > > > > math/src/main/java/org/apache/mahout/math/AbstractVector.java<https://reviews.apache.org/r/10669/diff/2/?file=283185#file283185line1115> > (Diff > revision 2) > > None > > {'text': ' private Vector assignIterateBoth(Iterator<Element> thisIterator, > Iterator<Element> thatIterator,', 'line': 1070} > > 1014 > > protected Vector assignSkipZerosIterateOne(Iterator<Element> thisIterator, > Vector other, > > be consistent, this that or thisIterator that Iterator. > > > > math/src/main/java/org/apache/mahout/math/AbstractVector.java<https://reviews.apache.org/r/10669/diff/2/?file=283185#file283185line1171> > (Diff > revision 2) > > None > > {'text': ' private Vector assignIterateBoth(Iterator<Element> thisIterator, > Iterator<Element> thatIterator,', 'line': 1070} > > 1070 > > private Vector assignIterateBoth(Iterator<Element> thisIterator, > Iterator<Element> thatIterator, > > This expects the data to to be sorted. indicate that in a comment. > Similarly comment other places > > > > math/src/main/java/org/apache/mahout/math/function/Functions.java<https://reviews.apache.org/r/10669/diff/2/?file=283202#file283202line900> > (Diff > revision 2) > > None > > {'text': ' public boolean isLikeLeftMult() {', 'line': 892} > > 896 > > /** > > Move this comments to the base class > > > - Robin > > On April 22nd, 2013, 8:45 p.m., Dan Filimon wrote: > Review request for mahout, Ted Dunning, Sebastian Schelter, and Robin > Anil. > By Dan Filimon. > > *Updated April 22, 2013, 8:45 p.m.* > Description > > This patch contains code cleaning up AbstractVector and making the operations > as fast as possible while still having a high level interface. > > The main changes are in AbstractVector as well as new methods in > DoubleDoubleFunction. > > Testing > > The vectors test pass but it's likely that the patch in it's current state is > broken as other, unrelated tests (BallKMeans...) are failing. > Also, my Hadoop conf is broken so I didn't run all the core tests. Anyone? > > I can't seem to find the bug, so _please_ have a closer look. It's still a > work in progress. > > The benchmarks seem comparable (although there are some jarring diferences – > Minkowski distance seems a lot slower in new-dan-1 than old-trunk-2). It may > be however that this is just variance due to the load of the machine at the > time. I'm having trouble interpreting the benchmarks in general, so anyone > who could give me a hand is more than welcome. > > Diffs > > - math/src/main/java/org/apache/mahout/math/AbstractMatrix.java > (e12aa38) > - math/src/main/java/org/apache/mahout/math/AbstractVector.java > (090aa7a) > - math/src/main/java/org/apache/mahout/math/Centroid.java (0c42196) > - math/src/main/java/org/apache/mahout/math/ConstantVector.java > (51d67d4) > - math/src/main/java/org/apache/mahout/math/DelegatingVector.java > (12220d4) > - math/src/main/java/org/apache/mahout/math/DenseVector.java (41c356b) > - > math/src/main/java/org/apache/mahout/math/FileBasedSparseBinaryMatrix.java > (094003b) > - math/src/main/java/org/apache/mahout/math/MatrixSlice.java (7f79c96) > - math/src/main/java/org/apache/mahout/math/MatrixVectorView.java > (af70727) > - math/src/main/java/org/apache/mahout/math/NamedVector.java (4b7a41d) > - math/src/main/java/org/apache/mahout/math/OrderedIntDoubleMapping.java > (650d82d) > - math/src/main/java/org/apache/mahout/math/PermutedVectorView.java > (d1ea93a) > - math/src/main/java/org/apache/mahout/math/RandomAccessSparseVector.java > (6f85692) > - > math/src/main/java/org/apache/mahout/math/SequentialAccessSparseVector.java > (21982f9) > - math/src/main/java/org/apache/mahout/math/Vector.java (2f8b417) > - math/src/main/java/org/apache/mahout/math/VectorView.java (add2a60) > - math/src/main/java/org/apache/mahout/math/WeightedVector.java > (06fbd05) > - > math/src/main/java/org/apache/mahout/math/function/DoubleDoubleFunction.java > (82b350a) > - math/src/main/java/org/apache/mahout/math/function/Functions.java > (eb2e42f) > - math/src/main/java/org/apache/mahout/math/function/PlusMult.java > (60587b1) > - math/src/main/java/org/apache/mahout/math/function/TimesFunction.java > (8ab0bb1) > - math/src/main/java/org/apache/mahout/math/jet/math/Constants.java > (53535d6) > - math/src/test/java/org/apache/mahout/math/AbstractVectorTest.java > (2b11199) > - math/src/test/java/org/apache/mahout/math/VectorTest.java (d6d554b) > > View Diff <https://reviews.apache.org/r/10669/diff/> >
