Hi,

I think in MIA code, ch09, RandomPointUtil.java , chooseRandomPoints()
method is ?????
I guess we'd like to have uniform random sampling (like Resevoir Sampling)

--- old code

  public static List<Vector> chooseRandomPoints(Iterable<Vector> vectors,
int k) {

    List<Vector> chosenPoints = new ArrayList<Vector>(k);

    Random random = RandomUtils.getRandom();

    for (Vector value : vectors) {

      int currentSize = chosenPoints.size();

      if (currentSize < k) {

        chosenPoints.add(value);

      } else if (random.nextInt(currentSize + 1) == 0) { // with chance
1/(currentSize+1) pick new element

        int indexToRemove = random.nextInt(currentSize); // evict one
chosen randomly

        chosenPoints.remove(indexToRemove);

        chosenPoints.add(value);

      }

    }

    return chosenPoints;

  }

---  suggested code

 public static List<Vector> chooseRandomPoints(Iterable<Vector> vectors, intk) {

    List<Vector> chosenPoints = new ArrayList<Vector>(k);

    Random random = RandomUtils.getRandom();

    // resevoir sampling

    int totalSize = 0;

    for (Vector value : vectors) {

    totalSize++;

      int currentSize = chosenPoints.size();

      if (currentSize < k) {

        chosenPoints.add(value);

      } else if (random.nextInt(totalSize) < currentSize) { // with chance
k/totalSize, pick new element

        int indexToRemove = random.nextInt(currentSize); // evict one
chosen randomly

        chosenPoints.remove(indexToRemove);

        chosenPoints.add(value);

      }



    }

    return chosenPoints;

  }

------

Am I right ?

Sam

Reply via email to