Looks like it. Steps from here include:
1) file a JIRA 2) Can you produce a test case to make the case? 3) one of us commits the fix. On Sat, Jan 12, 2013 at 2:58 PM, sam wu <[email protected]> wrote: > Hi, > > I think in MIA code, ch09, RandomPointUtil.java , chooseRandomPoints() > method is ????? > I guess we'd like to have uniform random sampling (like Resevoir Sampling) > > --- old code > > public static List<Vector> chooseRandomPoints(Iterable<Vector> vectors, > int k) { > > List<Vector> chosenPoints = new ArrayList<Vector>(k); > > Random random = RandomUtils.getRandom(); > > for (Vector value : vectors) { > > int currentSize = chosenPoints.size(); > > if (currentSize < k) { > > chosenPoints.add(value); > > } else if (random.nextInt(currentSize + 1) == 0) { // with chance > 1/(currentSize+1) pick new element > > int indexToRemove = random.nextInt(currentSize); // evict one > chosen randomly > > chosenPoints.remove(indexToRemove); > > chosenPoints.add(value); > > } > > } > > return chosenPoints; > > } > > --- suggested code > > public static List<Vector> chooseRandomPoints(Iterable<Vector> vectors, > intk) { > > List<Vector> chosenPoints = new ArrayList<Vector>(k); > > Random random = RandomUtils.getRandom(); > > // resevoir sampling > > int totalSize = 0; > > for (Vector value : vectors) { > > totalSize++; > > int currentSize = chosenPoints.size(); > > if (currentSize < k) { > > chosenPoints.add(value); > > } else if (random.nextInt(totalSize) < currentSize) { // with chance > k/totalSize, pick new element > > int indexToRemove = random.nextInt(currentSize); // evict one > chosen randomly > > chosenPoints.remove(indexToRemove); > > chosenPoints.add(value); > > } > > > > } > > return chosenPoints; > > } > > ------ > > Am I right ? > > Sam >
