Philippe Mouawad>
> "certain" in my sentence does not mean "certainty" :-) at least from what
> I understand in English.
>
Of course I mean "please provide some measurements of the parsing overhead" :-)

Philippe Mouawad>
> It more means "an impact of a certain degree".
> No numbers, more of reasoning that Parsing (based on Jodd or JSoup) comes
> at the cost of Regexp parsing, which I think has certainly :-) a cost right
> ?

Do you have some numbers to compare?
Of course HTML parsing is not free. The basic question is how much CPU it takes, so we can analyze, compare, and reproduce that.

Philippe Mouawad>
> That was my doubt. But take an ecommerce website where part of users are
> navigating anonymously, don't you think an important part of the pages is
> similar ?
> - product page
> - home page
> - category page
> ...
>

I do not have such experience, so I cannot tell what the hit rate would be.

Philippe> Maybe user could indicate in a way when to optimize and when not ?

That reminds me of http://mrale.ph/blog/2015/01/11/whats-up-with-monomorphism.html

For instance: make each HTTP sampler store additional state.
The state is one of "unknown" (initial), "has duplicates" (that is when we check the cache first), or "always unique" (skip caching, as the sampler is known to send unique outputs).
So for the first several executions we estimate whether the sampler is worth caching, then we switch into "has duplicates" or "always unique" mode.

Philippe> Maybe user could indicate in a way when to optimize and when not ?

The fewer knobs there are, the better the UX is.
I would try some automatic solution first, then semi-automatic, then fully manual.

> > 4) What if we implement "fetch links only during the first sampler
> > execution"?
> >
>
> Can you give more details on your idea ?
>

On the first sampler execution, do proper HTML parsing and collect the external links.
Then keep a poker face and just assume that this particular test element would always return the same set of resources no matter what.
Of course it will not work for cases like url=${home_or_product_page_based_on_the_moons_phase}, but for certain cases where the sampler is dedicated to one particular type of page it might work just fine.

Vladimir
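
P.S. The per-sampler state idea ("unknown" / "has duplicates" / "always unique") could be sketched roughly as below. This is only an illustration, not existing JMeter code: the class name SamplerCacheAdvisor, the warm-up length of 5 executions, and the use of a string fingerprint of the response (for example a hash of the body) are all assumptions made up for the example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: one instance would be attached to each HTTP sampler.
class SamplerCacheAdvisor {
    enum Mode { UNKNOWN, HAS_DUPLICATES, ALWAYS_UNIQUE }

    // Assumed warm-up length; the right value would have to be measured.
    private static final int WARMUP_SAMPLES = 5;

    private Mode mode = Mode.UNKNOWN;
    private final Set<String> seen = new HashSet<>();
    private int observed;

    // Feed a fingerprint of the latest response and get the current mode back.
    Mode observe(String fingerprint) {
        if (mode != Mode.UNKNOWN) {
            return mode; // decision already made, nothing more to track
        }
        observed++;
        if (!seen.add(fingerprint)) {
            // Same response seen twice during warm-up: caching looks useful.
            mode = Mode.HAS_DUPLICATES;
        } else if (observed >= WARMUP_SAMPLES) {
            // Warm-up finished without a single repeat: skip the cache.
            mode = Mode.ALWAYS_UNIQUE;
        }
        if (mode != Mode.UNKNOWN) {
            seen.clear(); // warm-up data is no longer needed
        }
        return mode;
    }

    // The parsed-links cache is consulted only in "has duplicates" mode.
    boolean shouldUseCache() {
        return mode == Mode.HAS_DUPLICATES;
    }
}
```

The sampler would call observe() after each response and ask shouldUseCache() before parsing. In this sketch the decision is never revisited once made; a real implementation might want to re-evaluate from time to time.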
