On Thu, Aug 11, 2016 at 9:36 PM, Vladimir Sitnikov <[email protected]> wrote:
> Philippe Mouawad>
> > "certain" in my sentence does not mean "certainty" :-) at least from what
> > I understand in English.
>
> Of course I mean "please provide some measurements of the parsing overhead" :-)
>
> Philippe Mouawad>
> > It more means "an impact of a certain degree".
> > No numbers, more of reasoning that Parsing (based on Jodd or JSoup) comes
> > at the cost of Regexp parsing, which I think has certainly :-) a cost, right?
>
> Do you have some numbers to compare?

No, before starting any work on this I wanted to have some feedback. I don't want to spend too much time on a potentially bad idea.

> Of course HTML parsing is not free. The basic question is how much CPU does
> it take, so we can analyze/compare/reproduce that.
>
> Philippe Mouawad>
> > That was my doubt. But take an ecommerce website where part of users are
> > navigating anonymously, don't you think an important part of the pages is
> > similar?
> > - product page
> > - home page
> > - category page
> > ...
>
> I do not have such experience, so I cannot tell what would be the hit rate.
>
> Philippe> Maybe user could indicate in a way when to optimize and when not?
>
> That reminds me
> http://mrale.ph/blog/2015/01/11/whats-up-with-monomorphism.html
> For instance: make each HTTP sampler store additional state.
> The state is one of "unknown" (initial), "has duplicates" (that is when we
> check cache first), "always unique" (avoid caching as sampler is known to
> send unique outputs).
>
> So the first several executions we estimate if the sampler is worth
> caching, then we switch into "has duplicates" or "always unique" mode.
>
> Philippe> Maybe user could indicate in a way when to optimize and when not?
>
> The lesser the number of knobs the better the UX is. I would try some
> automatic solution first, then semi-automatic, then fully manual.
>
> > > 4) What if we implement "fetch links only during the first sampler
> > > execution"?
> >
> > Can you give more details on your idea?
>
> On the first sampler execution, do proper HTML parsing and collect the
> external links. Then make a pokerface and just assume that this particular
> test element would always return the same set of resources no matter what.
> Of course it will not work for the cases like
> url=${home_or_product_page_based_on_the_moons_phase}, but for certain cases
> where the sampler is dedicated to one particular type of page it might work
> just fine.
>
> Vladimir

-- 
Cordialement.
Philippe Mouawad.
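
For illustration only, the three-state idea quoted above ("unknown", "has duplicates", "always unique") could be sketched roughly like this. Nothing here is JMeter code: the class name AdaptiveParseCache, the warm-up length of 10 samples and the 30% duplicate threshold are all assumptions picked for the example, and real numbers would have to come from measurements.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical per-sampler helper (not part of JMeter): decides after a short
// warm-up whether a sampler's responses repeat often enough to be worth caching.
class AdaptiveParseCache {
    enum Mode { UNKNOWN, HAS_DUPLICATES, ALWAYS_UNIQUE }

    // Both values are assumptions for this sketch, not measured figures.
    private static final int WARMUP_SAMPLES = 10;
    private static final double MIN_DUPLICATE_RATE = 0.3;

    // Only hash codes are kept to stay cheap; collisions are ignored for brevity.
    private final Set<Integer> seenHashes = new HashSet<>();
    private Mode mode = Mode.UNKNOWN;
    private int samples;
    private int duplicates;

    // Records one response body and returns the mode after taking it into account.
    synchronized Mode record(String responseBody) {
        if (mode != Mode.UNKNOWN) {
            return mode; // decision already made, nothing left to estimate
        }
        samples++;
        if (!seenHashes.add(responseBody.hashCode())) {
            duplicates++; // this body was already seen during warm-up
        }
        if (samples >= WARMUP_SAMPLES) {
            double rate = (double) duplicates / samples;
            mode = rate >= MIN_DUPLICATE_RATE ? Mode.HAS_DUPLICATES : Mode.ALWAYS_UNIQUE;
            seenHashes.clear(); // warm-up data is no longer needed
        }
        return mode;
    }

    synchronized Mode mode() {
        return mode;
    }
}
```

A sampler would then consult the parsed-links cache only while mode() is HAS_DUPLICATES, and parse directly otherwise. Whether the warm-up estimate stays valid for the rest of a test run is an open question in this sketch.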
