Re: HTML Value Extractor

peter lin Fri, 03 Oct 2003 05:34:26 -0700

 
The overall architecture of HTMLParser is based on the idea of listeners, so, for 
example, if you are parsing links, the results may include images. If you're parsing 
select elements, they will include options.
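
To make the listener idea concrete, here is a minimal sketch in plain Java. It is not the HTMLParser API: the ListenerScanSketch class, the TagListener interface, and the collect helper are hypothetical names made up for illustration, and the regex-based tag matching is a simplification that assumes double-quoted attributes. The point it shows is that only tags with a registered listener produce any work, and no document tree is ever built.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical listener-style scanner for illustration; NOT the HTMLParser API. */
final class ListenerScanSketch {

    /** Callback invoked for each opening tag a listener registered for. */
    interface TagListener {
        void onTag(String name, Map<String, String> attributes);
    }

    // Simplified matching: opening tags and double-quoted attributes only.
    private static final Pattern TAG =
            Pattern.compile("<([a-zA-Z][a-zA-Z0-9]*)\\b([^>]*)>");
    private static final Pattern ATTR =
            Pattern.compile("([a-zA-Z_:][-a-zA-Z0-9_:.]*)\\s*=\\s*\"([^\"]*)\"");

    private final Map<String, List<TagListener>> listeners = new HashMap<>();

    /** Registers a listener for one tag name (case-insensitive). */
    void register(String tagName, TagListener listener) {
        listeners.computeIfAbsent(tagName.toLowerCase(), k -> new ArrayList<>())
                 .add(listener);
    }

    /** Walks the markup once and notifies listeners; no tree is built. */
    void scan(String html) {
        Matcher tag = TAG.matcher(html);
        while (tag.find()) {
            String name = tag.group(1).toLowerCase();
            List<TagListener> interested = listeners.get(name);
            if (interested == null) {
                continue; // nobody registered for this tag, so skip it
            }
            Map<String, String> attrs = new HashMap<>();
            Matcher attr = ATTR.matcher(tag.group(2));
            while (attr.find()) {
                attrs.put(attr.group(1).toLowerCase(), attr.group(2));
            }
            for (TagListener listener : interested) {
                listener.onTag(name, attrs);
            }
        }
    }

    /** Convenience helper: collects one attribute from every matching tag. */
    static List<String> collect(String html, String tagName, String attribute) {
        List<String> values = new ArrayList<>();
        ListenerScanSketch scanner = new ListenerScanSketch();
        scanner.register(tagName, (name, attrs) -> {
            String value = attrs.get(attribute);
            if (value != null) {
                values.add(value);
            }
        });
        scanner.scan(html);
        return values;
    }

    public static void main(String[] args) {
        String html = "<p><a href=\"/a\">A</a><img src=\"logo.png\"/></p>";
        System.out.println("links:  " + collect(html, "a", "href"));
        System.out.println("images: " + collect(html, "img", "src"));
    }
}
```

A real scanner would also have to handle nesting, unquoted attributes, and comments; the sketch leaves all of that out on purpose.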
 
The primary difference between how Tidy and HTMLParser work is that Tidy builds a DOM. 
We know from experience that this is OK for a low number of threads. The speed of Tidy 
is acceptable, but the memory usage with 50+ threads in a thread group is where it 
gets slow.
 
It's an inherent problem of DOM and not the fault of Tidy or JMeter. If you use the 
form nodes as single entities without relationship to the entire HTML document, then 
you're fine. If you need to check against dynamic nodes or values across the entire 
document, then DOM is really the easiest choice.
 
If you're really interested in the details, the parser class in HTMLParser provides 
examples of how the FormScanner class is registered with the parser. I hope that 
helps. If you have specific questions, feel free to email me directly. I'm not an 
expert on HTMLParser, but it would make sense to address the needs at the same time I 
swap out Tidy for HTMLParser.
 
peter lin



Joseph Fifield <[EMAIL PROTECTED]> wrote:
Sounds good. In the meantime, I have the HTML Form Field Value Extractor
working using o.a.j.protocol.http.parser.HtmlParser.getDOM. It was actually
pretty straightforward (and it allowed me to prove the concept and get my
hands in the source a bit). Let me know what happens with HTMLParser and I
will look into changing it to use that.
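
For readers following along, the DOM-based approach described above might look roughly like the sketch below. It is not the actual extractor code: it assumes the response is already well-formed XHTML (roughly what a Tidy-style cleanup would produce) and uses the JDK's own XML parser as a stand-in for the DOM that o.a.j.protocol.http.parser.HtmlParser.getDOM returns. The FormFieldDomSketch class and its method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical sketch of DOM-based form field extraction; not JMeter code. */
final class FormFieldDomSketch {

    /** Builds a DOM from well-formed XHTML (assumption: input is already tidy). */
    static Document parse(String xhtml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XHTML", e);
        }
    }

    /** Returns the value attribute of the first input with this name, or null. */
    static String inputValue(Document dom, String fieldName) {
        NodeList inputs = dom.getElementsByTagName("input");
        for (int i = 0; i < inputs.getLength(); i++) {
            Element input = (Element) inputs.item(i);
            if (fieldName.equals(input.getAttribute("name"))) {
                return input.getAttribute("value");
            }
        }
        return null;
    }

    /** Returns the value of the option whose text matches, or null. */
    static String selectValue(Document dom, String fieldName, String optionText) {
        NodeList selects = dom.getElementsByTagName("select");
        for (int i = 0; i < selects.getLength(); i++) {
            Element select = (Element) selects.item(i);
            if (!fieldName.equals(select.getAttribute("name"))) {
                continue;
            }
            NodeList options = select.getElementsByTagName("option");
            for (int j = 0; j < options.getLength(); j++) {
                Element option = (Element) options.item(j);
                if (optionText.equals(option.getTextContent().trim())) {
                    return option.getAttribute("value");
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Document dom = parse("<form><input name=\"token\" value=\"abc123\"/></form>");
        System.out.println(inputValue(dom, "token"));
    }
}
```

The two lookups correspond to the two element types mentioned in this thread (input and select); the whole document is held in memory for each lookup, which is the cost discussed elsewhere in the thread.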

You also mentioned an HTML Select Options Extractor. What exactly would this
do? The HTML Form Field Value Extractor can already extract a single value
from a select element (given the option text). Were you thinking this would
be a whole different test element, or would it do more?

Joe

----- Original Message ----- 
From: "peter lin" 
To: "JMeter Developers List" 
Sent: Thursday, October 02, 2003 7:52 PM
Subject: Re: HTML Value Extractor


>
>
> Hi Joseph,
>
>
> If you can wait a few days, I will hear back from the developers of
> HTMLParser about using it directly, instead of requiring users to download
> the jar separately. HTMLParser is licensed under LGPL, so I'm waiting to
> hear from them. Right now the lead developer/maintainer likes the idea and
> we're just waiting for feedback from the others.
>
> I hope to help make HTMLParser a Jakarta project, since I can think of
> plenty of ways to use HTMLParser within JMeter :)
>
> If you haven't looked at the performance results, HTMLParser appears to
> scale linearly and uses less memory when the number of nodes you need to
> get is less than 50.
>
> peter lin
>
>
> Joseph Fifield wrote:
> I don't have an immediate need for the others, but I'm sure some of them
> will come up as the tests grow. I can already think of places I could use
> the HTML Link HRef Extractor.
>
> I started toying with some ideas by using
> o.a.j.protocol.http.parser.HtmlParser.getDOM and looking for matching
> nodes. I assume this isn't the most efficient way of doing it. I saw some
> references to HTMLParser (http://htmlparser.sourceforge.net) in other
> replies. Is that already included in JMeter? Should I look into using that
> instead?
>
> Joe
>
> ----- Original Message ----- 
> From: "Jordi Salvat i Alabart"
> To: "JMeter Developers List"
> Sent: Thursday, October 02, 2003 1:57 PM
> Subject: Re: HTML Value Extractor
>
>
> > It sounds great. You've missed nothing. I don't think there's a simpler
> > way.
> >
> > This was actually the original idea behind Extractors. I've been trying
> > for months to find time to work on this, but unfortunately I can hardly
> > keep up with work & family responsibilities...
> >
> > If you have the time, here are some ideas:
> >
> > - HTML Form Field Value Extractor
> >
> > You said it.
> >
> > - HTML Header Extractor
> >
> > User provides a header name. The value of that header will be placed in
> > a variable.
> >
> > - HTML Link HRef Extractor
> >
> > User provides a link name (or regexp) or index. The *full* URL for that
> > link will be placed in a variable (note it's not that easy to do that
> > using the Regexp Extractor).
> >
> > Would be really nice to be able to extract multiple links to get an
> > array, but JMeter currently has no mechanism to process such an array...
> > maybe a ForEach Logic Controller?
> >
> > - HTML Form Action Extractor
> >
> > User provides form name (or regexp) or index, or content or regexp for
> > the action field. The *full* URL for the action field in that form will
> > be placed in a variable.
> >
> > More:
> > - HTML Img Src Extractor
> > - HTML Select Options Extractor
> > ...
> >
> > Please please if you work on that, send us the code. I'll be very happy
> > to test and commit it.
> >
> > Salut,
> >
> > Jordi.
> >
> > Joseph Fifield wrote:
> > > Hello,
> > >
> > > I need to pull a value out of an http response to use in the next
> > > request. Specifically, I need to get a value from an html form element
> > > from a response. Now, I already got this working using the Regular
> > > Expression Extractor, and it works great. However, I need a solution
> > > that's a bit easier for a non-programmer to use (and not have to deal
> > > with regex).
> > >
> > > I started working on what I'm currently calling HTML Value Extractor. It
> > > is a post processor implementation that simply pulls the value from a
> > > form element (as specified by the test element properties) and puts it
> > > into a variable. The test element properties include the variable name,
> > > the type of form element (right now, just input and select), the name of
> > > the form element, and for select elements, the text of the option
> > > element to select.
> > >
> > > How does this sound? Is there already a simpler way that I've just
> > > missed entirely? I'm also curious if there would be any interest in
> > > adding this new test element to JMeter once I've finished it?
> > >
> > > Thanks!
> > >
> > > Joe
> > >
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > For additional commands, e-mail: [EMAIL PROTECTED]
> > >
> > >
> >
> >
> >
> >
> >
>
>

