[ https://issues.apache.org/jira/browse/MAHOUT-300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836624#action_12836624 ]
Robin Anil commented on MAHOUT-300:
-----------------------------------

I think the irregularity is due to the sparse vector generation process: duplicate index values can be generated, which leaves some vectors with fewer non-zero entries than the sparsity value.

{code}
Vector v = new SequentialAccessSparseVector(cardinality, sparsity); // sparsity!
int[] indexes = new int[sparsity];
double[] values = new double[sparsity];
for (int j = 0; j < sparsity; j++) {
  double value = r.nextGaussian();
  // nextInt can return the same index more than once, overwriting an earlier entry
  int index = sparsity < cardinality ? r.nextInt(cardinality) : j;
  v.set(index, value);
  indexes[j] = index;
  values[j] = value;
}
{code}

Instead, I suggest this:

{code}
Vector v = new SequentialAccessSparseVector(cardinality, sparsity); // sparsity!
boolean[] featureSpace = new boolean[cardinality]; // marks indexes already used
int[] indexes = new int[sparsity];
double[] values = new double[sparsity];
int j = 0;
// note: only terminates when sparsity <= cardinality
while (j < sparsity) {
  double value = r.nextGaussian();
  int index = r.nextInt(cardinality);
  if (!featureSpace[index]) {
    featureSpace[index] = true;
    indexes[j] = index;
    values[j++] = value;
    v.set(index, value);
  }
}
{code}

> Solve performance issues with Vector Implementations
> ----------------------------------------------------
>
>                 Key: MAHOUT-300
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-300
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.3
>            Reporter: Robin Anil
>             Fix For: 0.3
>
>         Attachments: MAHOUT-300.patch, MAHOUT-300.patch, MAHOUT-300.patch, MAHOUT-300.patch, MAHOUT-300.patch, MAHOUT-300.patch
>
>
> AbstractVector operations like times
>   public Vector times(double x) {
>     Vector result = clone();
>     Iterator<Element> iter = iterateNonZero();
>     while (iter.hasNext()) {
>       Element element = iter.next();
>       int index = element.index();
>       result.setQuick(index, element.get() * x);
>     }
>     return result;
>   }
> should be implemented as follows
>   public Vector times(double x) {
>     Vector result = clone();
>     Iterator<Element> iter = result.iterateNonZero();
>     while (iter.hasNext()) {
>       Element element = iter.next();
>       element.set(element.get() * x);
>     }
>     return result;
>   }
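
The duplicate-index effect described in the comment above can be checked without Mahout. The following is a minimal standalone sketch, not code from the patch: the class and method names (SparseIndexSampling, drawWithDuplicates, drawDistinct, countDistinct) are hypothetical, and only java.util is used. It reproduces just the index selection of the two loops and counts how many distinct indexes each one yields.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical helper, not part of Mahout: compares the two index-generation strategies.
final class SparseIndexSampling {

  private SparseIndexSampling() {}

  // Mirrors the original loop: indexes are drawn independently, so repeats are possible.
  static int[] drawWithDuplicates(Random r, int cardinality, int sparsity) {
    int[] indexes = new int[sparsity];
    for (int j = 0; j < sparsity; j++) {
      indexes[j] = sparsity < cardinality ? r.nextInt(cardinality) : j;
    }
    return indexes;
  }

  // Mirrors the suggested loop: a boolean mask rejects indexes that were already used.
  static int[] drawDistinct(Random r, int cardinality, int sparsity) {
    if (sparsity > cardinality) {
      // Assumption added for this sketch: fail fast instead of looping forever.
      throw new IllegalArgumentException("sparsity must not exceed cardinality");
    }
    boolean[] used = new boolean[cardinality];
    int[] indexes = new int[sparsity];
    int j = 0;
    while (j < sparsity) {
      int index = r.nextInt(cardinality);
      if (!used[index]) {
        used[index] = true;
        indexes[j++] = index;
      }
    }
    return indexes;
  }

  // Number of distinct indexes, i.e. how many entries a vector filled via set(index, value) would hold.
  static int countDistinct(int[] indexes) {
    Set<Integer> seen = new HashSet<>();
    for (int index : indexes) {
      seen.add(index);
    }
    return seen.size();
  }
}
```

With cardinality 1000 and sparsity 500, for example, countDistinct(drawWithDuplicates(...)) is roughly 394 in expectation, while drawDistinct always yields exactly 500 distinct indexes. The rejection loop is cheap while sparsity is well below cardinality; as the two get close, more draws are rejected, so a shuffle-based selection might be a better fit there.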