Hi,
I think in MIA code, ch09, RandomPointUtil.java , chooseRandomPoints()
method is ?????
I guess we'd like to have uniform random sampling (like Resevoir Sampling)
--- old code
public static List<Vector> chooseRandomPoints(Iterable<Vector> vectors,
int k) {
List<Vector> chosenPoints = new ArrayList<Vector>(k);
Random random = RandomUtils.getRandom();
for (Vector value : vectors) {
int currentSize = chosenPoints.size();
if (currentSize < k) {
chosenPoints.add(value);
} else if (random.nextInt(currentSize + 1) == 0) { // with chance
1/(currentSize+1) pick new element
int indexToRemove = random.nextInt(currentSize); // evict one
chosen randomly
chosenPoints.remove(indexToRemove);
chosenPoints.add(value);
}
}
return chosenPoints;
}
--- suggested code
public static List<Vector> chooseRandomPoints(Iterable<Vector> vectors, intk) {
List<Vector> chosenPoints = new ArrayList<Vector>(k);
Random random = RandomUtils.getRandom();
// resevoir sampling
int totalSize = 0;
for (Vector value : vectors) {
totalSize++;
int currentSize = chosenPoints.size();
if (currentSize < k) {
chosenPoints.add(value);
} else if (random.nextInt(totalSize) < currentSize) { // with chance
k/totalSize, pick new element
int indexToRemove = random.nextInt(currentSize); // evict one
chosen randomly
chosenPoints.remove(indexToRemove);
chosenPoints.add(value);
}
}
return chosenPoints;
}
------
Am I right ?
Sam