On 20/11/2009, rosiere <[email protected]> wrote:
>
> Hello,
>
> Thanks for your explanation.
> In fact, the HTML layout that I am trying to parse is stable and unlikely
> to change, which is why I want to parse it.
>
> Since I'm not good at regex, I will use JMeter only to get the HTML
> response from an https-based web site and to store the parsing results in Java
> objects like ArrayList.
>
> So I created some HTTP Request samplers, then attached a BeanShell
> PostProcessor to them.
> In the BeanShell script I wrote some logic with the W3C DOM and JTidy APIs,
> and now I can see the extracted cell contents via System.err.println() in my
> BeanShell script.
You could have saved yourself some work by using the XPath Extractor...
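For reference, a rough sketch of what the XPath approach amounts to, written as plain Java with the JDK's javax.xml.xpath so it can be tried outside JMeter. The class name XPathCells, the sample markup and the class values are made up for illustration, and the input here must be well-formed XML; for real-world HTML the XPath Extractor's Tidy option would be needed. For the page in this thread the expression would presumably end in td[8] and td[9] rather than td[2].

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of JMeter: evaluates an XPath expression
// against a well-formed document and returns the text of each matching node.
class XPathCells {
    static List<String> select(String xml, String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(nodes.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample: two data rows and one header row.
        String xml = "<table>"
                + "<tr class=\"head\"><td>x</td><td>y</td></tr>"
                + "<tr class=\"tgDataLine1\"><td>a</td><td>b</td></tr>"
                + "<tr class=\"tgDataLine2\"><td>c</td><td>d</td></tr>"
                + "</table>";
        // Second cell of every row whose class starts with "tgDataLine".
        System.out.println(select(xml, "//tr[starts-with(@class,'tgDataLine')]/td[2]"));
        // prints [b, d]
    }
}
```

Note that XPath element names are case sensitive, so the expression has to match the case of the tags in the parsed document.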
> After that I had difficulties with the use of JMeter variables. In my BeanShell
> script I created ArrayList objects, stored the extracted texts in them, and
> put them into the JMeter context:
> vars.put("responseList", responseList);
> vars.put("responseDateList", responseDateList);
> http://old.nabble.com/file/p26443545/BeanShellPostProcessor.gif
>
> After parsing my HTML response, I need a ForEach Controller to
> iterate over the elements of these List objects (which are just the values
> of the selected <td> elements) and to issue a JDBC request to store them in
> a database (or any other operation that sends them out of JMeter).
> http://old.nabble.com/file/p26443545/ForEachController.gif
>
> However, I was unable to get a ForEach Controller to operate on objects in vars.
>
> What did I miss, and what should I do to iterate over the contents of vars and
> run a sampler on each value in the iteration?
>
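The ForEach Controller does not iterate over a List object. It reads String variables named with a common prefix plus _1, _2, ... (for example responseList_1, responseList_2) until the next one is missing. A minimal sketch of that naming scheme follows, as plain Java so it can be run on its own; the Map is only a stand-in for JMeter's vars, and the class and method names are made up. In the BeanShell PostProcessor the equivalent would be one vars.put(name, value) call per element, with String values.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: flattens a list into prefix_1, prefix_2, ... entries,
// which is the variable layout the ForEach Controller expects.
class ForEachVars {
    static Map<String, String> expand(String prefix, List<String> values) {
        // Stand-in for JMeter's vars (variable name -> String value).
        Map<String, String> vars = new LinkedHashMap<>();
        for (int i = 0; i < values.size(); i++) {
            // Numbering starts at 1, not 0.
            vars.put(prefix + "_" + (i + 1), values.get(i));
        }
        return vars;
    }

    public static void main(String[] args) {
        List<String> responseList = Arrays.asList("first cell", "second cell");
        System.out.println(expand("responseList", responseList));
        // prints {responseList_1=first cell, responseList_2=second cell}
    }
}
```

With variables laid out like this, the ForEach Controller's input variable prefix would be responseList, and the sampler underneath it can refer to the controller's output variable.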
> With my best wishes,
>
> Rosière
>
>
>
> Deepak Shetty wrote:
> >
> > Hi
> > the regex you are using doesn't seem correct:
> > [^tr]
> > matches any single character that is neither 't' nor 'r'; it doesn't mean
> > "not the sequence tr".
> >
> > Plus, if you are getting multiple <tr> instead of the one you expect, your
> > regex is probably too greedy; try replacing the .* constructs with .*? or
> > modify the regex.
> >
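Both points are easy to check in isolation. The sketch below uses java.util.regex rather than JMeter's ORO engine (the two behave the same for these constructs, as far as I know); the class name and the sample HTML are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: compares greedy .* with reluctant .*?
// and shows what the character class [^tr] really matches.
class GreedyVsReluctant {
    // Returns group 1 of every match of regex in html.
    static List<String> rows(String html, String regex) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(html);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String html = "<tr><td>a</td></tr><tr><td>b</td></tr>";

        // Greedy: .* runs to the LAST </tr>, so both rows land in one group.
        System.out.println(rows(html, "(?is)<tr>(.*)</tr>"));
        // prints [<td>a</td></tr><tr><td>b</td>]

        // Reluctant: .*? stops at the FIRST </tr>, giving one match per row.
        System.out.println(rows(html, "(?is)<tr>(.*?)</tr>"));
        // prints [<td>a</td>, <td>b</td>]

        // [^tr] consumes exactly one character that is neither 't' nor 'r'.
        System.out.println("x".matches("[^tr]")); // prints true
        System.out.println("t".matches("[^tr]")); // prints false
    }
}
```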
> > In any case, XPath is as dependent on the HTML structure as a regex is
> > (e.g. what if you move to a tableless layout?).
> >
> >
> > regards
> > deepak
> >
> > On Thu, Nov 19, 2009 at 8:17 AM, rosiere <[email protected]> wrote:
> >
> >>
> >> Hello,
> >>
> >> Thanks for your advice.
> >>
> >> I did apply a case-insensitive check, like this:
> >>
> >> (?is)<tr\sclass="tgDataLine.*1\)\" >([^tr].*)</tr>
> >>
> >> However, I still face a problem: now I capture all <tr> elements in the
> >> same group instead of each <tr> element separately.
> >>
> >> My jmeter.log shows this information about the matching:
> >>
> >> 2009/11/19 17:03:33 DEBUG - jmeter.extractor.RegexExtractor: Regex = (?is)<tr\sclass="tgDataLine.*1\)\" >([^tr].*)</tr>
> >> 2009/11/19 17:03:33 DEBUG - jmeter.extractor.RegexExtractor: RegexExtractor: Match found!
> >> 2009/11/19 17:03:33 DEBUG - jmeter.extractor.RegexExtractor: RegexExtractor: Template piece #0 = 1
> >> 2009/11/19 17:03:33 DEBUG - jmeter.extractor.RegexExtractor: RegexExtractor: Template piece #1 =
> >> 2009/11/19 17:03:33 DEBUG - jmeter.extractor.RegexExtractor: Regex Extractor result =
> >> <TD>....<TD>
> >> <TR>...</TR>
> >> ...
> >> <TR>....</TR>
> >> <TD>
> >>
> >>
> >> As for alternatives, I did want to parse the HTML with the org.w3c.dom API,
> >> but DOM methods like getElementsByTagName() are all case sensitive and may
> >> not be able to handle HTML with both uppercase and lowercase tags.
> >>
> >> Besides, whenever the HTML page changes, I would have to rewrite my Java
> >> code based on the DOM API. To minimize these unwanted effects on my Java
> >> code, I would still like to use regex, so that whenever the HTML structure
> >> changes I need only change the regex in JMeter and not the Java code that
> >> consumes the extracted HTML portions.
> >>
> >>
> >>
> >> Deepak Shetty wrote:
> >> >
> >> > You should probably make the check case insensitive, but I agree with
> >> > sebb: parsing HTML constructs with regex is a pain and breaks quite
> >> > frequently.
> >> > regards
> >> > deepak
> >> >
> >> > On Wed, Nov 18, 2009 at 10:37 AM, Andre Arnold <[email protected]> wrote:
> >> >
> >> >> sebb wrote:
> >> >> > On 18/11/2009, rosiere <[email protected]> wrote:
> >> >> >
> >> >> >> Hello,
> >> >> >>
> >> >> >> I found that JMeter's ORO regex is somewhat different from Java's.
> >> >> >>
> >> >> >
> >> >> > Yes.
> >> >> >
> >> >> > But not all that different, and neither is particularly well suited
> >> >> > to this task.
> >> >> >
> >> >> > The XPath Extractor will probably be much easier to use.
> >> >> >
> >> >> >
> >> >> > http://jakarta.apache.org/jmeter/usermanual/component_reference.html#XPath_Extractor
> >> >> >
> >> >> > This was discussed on the mailing list earlier this year.
> >> >> >
> >> >> >
> >> >> >> Now I need to iterate over the different <tr> elements that match a
> >> >> >> pattern, then capture all the <td> elements within each <tr> and
> >> >> >> select the 8th and 9th <td>.
> >> >> >>
> >> >> >> Since many <tr> elements appear in the HTML response, I have to
> >> >> >> capture the <tr> elements one by one, without including two of them
> >> >> >> in the same group, so I should avoid capturing consecutive
> >> >> >> <tr>..</tr><tr>..</tr> into the same group.
> >> >> >>
> >> >> >> By writing (?is)<tr\sclass="tgDataLine.*1\)\" >(.*)</tr> I capture
> >> >> >> only one group, which contains many real <tr> elements.
> >> >> >> So what should I write in the regex?
> >> >> >>
> >> >> >>
> >> >> If you still need a pattern that matches your needs:
> >> >> I found that the following matches the number you wanted and the
> >> >> value of the following column.
> >> >>
> >> >> reference: ref
> >> >> pattern: (?s)<TR.+?<TD.+?>([0-9]+?)</TD.+?<TD.+?>(.+?)</TD>
> >> >> template: $1$$2$
> >> >> match : 1
> >> >>
> >> >> In ref_g1 you'll find the number.
> >> >> In ref_g2 you'll find the following column value.
> >> >>
> >> >> To catch all the matches you need to increment a counter for the match
> >> >> number and check whether there is another match or not.
> >> >>
> >> >> Your test plan should look something like this:
> >> >>
> >> >> -while controller (${__javaScript("${ref}"!="error")} )
> >> >> --counter (from 1 with increment 1 for the regex match value)
> >> >> --Http Sampler (to get your site)
> >> >> ---RegEx Extractor (as shown above)
> >> >> --if controller (same condition as the while controller --> "${ref}"!="error")
> >> >> ---your jdbc action (use ref_g1 & ref_g2)
> >> >>
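Outside JMeter, the "take match 1, then match 2, ... until nothing is left" loop above corresponds to calling find() repeatedly on a Matcher. A small sketch with java.util.regex, using the pattern from this message with the digit class written as [0-9]; the class name and the sample rows are made up. Note that the pattern expects each <TD to carry at least one character (such as an attribute) before its closing >.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: collects group 1 (the number) and group 2 (the
// following column) for every row, like ref_g1 / ref_g2 per match number.
class AllMatches {
    static final Pattern ROW = Pattern.compile(
            "(?s)<TR.+?<TD.+?>([0-9]+?)</TD.+?<TD.+?>(.+?)</TD>");

    static List<String[]> extract(String html) {
        List<String[]> out = new ArrayList<>();
        Matcher m = ROW.matcher(html);
        // Each find() call moves on to the next match, like the counter does.
        while (m.find()) {
            out.add(new String[] { m.group(1), m.group(2) });
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample with two rows.
        String html = "<TR class=\"a\"><TD class=\"x\">12</TD><TD class=\"y\">foo</TD></TR>"
                + "<TR class=\"a\"><TD class=\"x\">34</TD><TD class=\"y\">bar</TD></TR>";
        for (String[] pair : extract(html)) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```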
> >> >>
> >> >> Hope I got your problem right.
> >> >>
> >> >> ---------------------------------------------------------------------
> >> >> To unsubscribe, e-mail: [email protected]
> >> >> For additional commands, e-mail: [email protected]
> >> >>
> >> >>
> >> >
> >> >
> >>
> >> --
> >> View this message in context:
> >>
> http://old.nabble.com/How-can-I-extract-cell-data-%28content-surrounded-by-%3Ctd%3E%3C-td%3E%29-from-a-%3Ctable%3E-in-HTML-response--tp26371440p26421379.html
> >> Sent from the JMeter - User mailing list archive at Nabble.com.
> >
> >
>
>