Philippe Mouawad>

> "certain"  in my sentence does not mean "certainty" :-) at least from what
> I understand in english.
>

Of course I mean "please provide some measurements of the parsing overhead"
 :-)

Philippe Mouawad>

> It more means "an impact of a certain degree".
> No numbers, more of reasoning that Parsing (based on Jodd or JSoup) comes
> at the cost of Regexp parsing, which I think has certainly :-) a cost right
> ?
>

Do you have some numbers to compare?
Of course HTML parsing is not free. The basic question is how much CPU does
it take, so we can analyze/compare/reproduce that.

Philippe Mouawad>

> That was my doubt. But take an ecommerce website where part of users are
> navigating anonymously, don't you think an important part of the pages is
> similar ?
> - product page
> - home page
> - category page
> ...
>

I do not have such experience, so I cannot tell what the hit rate would be.


Philippe> Maybe user could indicate in a way when to optimize and when not ?

That reminds me of
http://mrale.ph/blog/2015/01/11/whats-up-with-monomorphism.html
For instance: make each HTTP sampler store additional state.
The state is one of "unknown" (initial), "has duplicates" (in which case we
check the cache first), or "always unique" (skip caching, as the sampler is
known to produce unique outputs).

So during the first several executions we estimate whether the sampler is
worth caching, then we switch into "has duplicates" or "always unique" mode.
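
A rough sketch of that state machine in Java, just to make the idea
concrete. Everything here is made up for illustration: ResponseCacheAdvisor
is not an existing JMeter class, the warm-up count is an arbitrary knob, and
I assume the caller can supply some digest of each response body.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical per-sampler state holder; not part of JMeter. */
final class ResponseCacheAdvisor {
    enum Mode { UNKNOWN, HAS_DUPLICATES, ALWAYS_UNIQUE }

    private final int warmupSamples;               // assumed tuning value
    private final Set<String> seenDigests = new HashSet<>();
    private int observed;
    private Mode mode = Mode.UNKNOWN;

    ResponseCacheAdvisor(int warmupSamples) {
        this.warmupSamples = warmupSamples;
    }

    /** Feed a digest of each response; returns the mode after this sample. */
    synchronized Mode observe(String responseDigest) {
        if (mode != Mode.UNKNOWN) {
            return mode;                           // decision already made
        }
        observed++;
        if (!seenDigests.add(responseDigest)) {
            mode = Mode.HAS_DUPLICATES;            // a repeat: caching may pay off
        } else if (observed >= warmupSamples) {
            mode = Mode.ALWAYS_UNIQUE;             // no repeats during warm-up
        }
        if (mode != Mode.UNKNOWN) {
            seenDigests.clear();                   // warm-up data no longer needed
        }
        return mode;
    }

    /** True only when the sampler was seen to repeat its output. */
    synchronized boolean shouldCheckCache() {
        return mode == Mode.HAS_DUPLICATES;
    }
}
```

Once the mode leaves "unknown" it never changes in this sketch; a real
implementation might want to re-evaluate from time to time.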


Philippe> Maybe user could indicate in a way when to optimize and when not ?

The fewer knobs there are, the better the UX is. I would try an automatic
solution first, then semi-automatic, then fully manual.



> > 4) What if we implement "fetch links only during the first sampler
> > execution"?
> >
>
> Can you give more details on your idea ?
>

On the first sampler execution, do proper HTML parsing and collect the
external links. Then make a pokerface and just assume that this particular
test element would always return the same set of resources no matter what.
Of course it will not work for cases like
url=${home_or_product_page_based_on_the_moons_phase}, but for cases where
the sampler is dedicated to one particular type of page it might work
just fine.
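
Roughly something like the following Java sketch. The names are
hypothetical (FirstRunLinkMemo does not exist in JMeter), and the Function
argument merely stands in for whatever real HTML parser would be used; the
only point is that the parser runs once per test element and its result is
reused afterwards.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper: parse links on the first call, reuse them later. */
final class FirstRunLinkMemo {
    // Stand-in for a real HTML parser (assumption, supplied by the caller).
    private final Function<String, List<String>> linkParser;
    private volatile List<String> links;           // null until the first run

    FirstRunLinkMemo(Function<String, List<String>> linkParser) {
        this.linkParser = linkParser;
    }

    /** Returns the links found in the first response ever passed in. */
    List<String> linksFor(String html) {
        List<String> cached = links;
        if (cached == null) {
            synchronized (this) {
                if (links == null) {
                    // Proper parsing happens only here, on the first run.
                    links = Collections.unmodifiableList(
                            new ArrayList<>(linkParser.apply(html)));
                }
                cached = links;
            }
        }
        return cached;                             // later responses are ignored
    }
}
```

Each sampler would own one such object, so the "pokerface" assumption is
scoped to a single test element.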


Vladimir
