1) Regarding content hashing, there is a question of which hash function we should use. For instance, https://github.com/OpenHFT/Zero-Allocation-Hashing offers fast implementations of several hash functions: FarmHash, CityHash, MurmurHash3. We might want to apply it to the other "MD5" usages as well.
2) Philippe>this has a certain CPU impact related to HTML parsing to extract the links.
Do you have any numbers that quantify this "certain" impact?
3) Re "cache HTML parsing": it does not sound very useful. The pages I typically see differ in content, so a cache there does not look promising.
4) What if we implement "fetch links only during the first sampler execution"? As far as I understand, the idea of "fetching resources automatically" is that users do not have to hard-code the resources right into the jmx. It might be OK if we implement a Cache<TestElement, List<URL>> kind of thing.

Vladimir
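
P.S. To make 4) a bit more concrete, here is a rough sketch of the "parse only on the first sampler execution" idea. It is only an illustration under assumptions: EmbeddedLinkCache and linksFor are hypothetical names, not existing JMeter classes; the key type is generic so the sketch compiles without JMeter (in JMeter it would be the TestElement); and it uses URI instead of URL just to avoid checked exceptions. The parser passed in stands for whatever link extractor is actually used.

```java
import java.net.URI;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical helper for illustration only; not part of JMeter. */
final class EmbeddedLinkCache<K> {
    // In JMeter the key would be the sampler (TestElement); it is generic here
    // so that the sketch stays standalone.
    private final Map<K, List<URI>> links = new ConcurrentHashMap<>();

    /**
     * Runs the parser only the first time a given sampler is seen.
     * Later calls ignore the html argument and return the stored list.
     */
    List<URI> linksFor(K sampler, String html, Function<String, List<URI>> parser) {
        return links.computeIfAbsent(sampler, k -> List.copyOf(parser.apply(html)));
    }

    public static void main(String[] args) {
        AtomicInteger parses = new AtomicInteger();
        // Stand-in parser: counts invocations and returns a fixed link.
        Function<String, List<URI>> parser = html -> {
            parses.incrementAndGet();
            return List.of(URI.create("http://example.com/style.css"));
        };

        EmbeddedLinkCache<String> cache = new EmbeddedLinkCache<>();
        cache.linksFor("Home page sampler", "<html>first</html>", parser);
        cache.linksFor("Home page sampler", "<html>second</html>", parser);
        System.out.println("parser invocations: " + parses.get());
    }
}
```

The obvious trade-off is that the second and later executions reuse the links from the first response, so resources that appear only in later responses would be missed. Whether that is acceptable is exactly the question in 4).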
