No, it doesn't. JTidy works well.

I suspect your guess is wrong... :-)

--
Salut,

Jordi.

peter lin wrote:
can you verify if the old JTidy implementation
contains the same bug?

I'm going to guess it's how I'm using htmlparser.

peter


--- Jordi Salvat i Alabart <[EMAIL PROTECTED]> wrote:


Responding to myself again...

I've been running some more tests with JVM arguments that I believe are
more sensible, namely:

-Xms256m -Xmx256m -XX:NewSize=64m -XX:MaxNewSize=64m
-XX:MaxLiveObjectEvacuationRatio=40
-XX:SurvivorRatio=8

With these, the performance difference has almost disappeared: I'm
getting about 12 samples/second with the htmlparser and 15
samples/second with the regexp approach. The htmlparser solution
generates about 5 times more garbage than the regexp solution, which
explains why the results were so tremendously different using -Xincgc.
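
One way to confirm that heap arguments like those above actually reached
the JVM is to print what the runtime reports. The sketch below is an
illustration only: the class name HeapSettingsCheck is hypothetical and
it is not part of JMeter. The value from Runtime.maxMemory() may come out
slightly below the -Xmx setting on some JVMs.

```java
// Hypothetical helper, not part of JMeter: reports the heap sizes the
// running JVM sees, so the effect of -Xms/-Xmx can be checked.
class HeapSettingsCheck {

    private static final long MEGABYTE = 1024L * 1024L;

    // Upper bound the JVM will try to use for the heap (driven by -Xmx).
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / MEGABYTE;
    }

    // Heap currently reserved by the JVM (starts near -Xms).
    static long committedHeapMegabytes() {
        return Runtime.getRuntime().totalMemory() / MEGABYTE;
    }

    public static void main(String[] args) {
        System.out.println("max heap (MB):       " + maxHeapMegabytes());
        System.out.println("committed heap (MB): " + committedHeapMegabytes());
    }
}
```

Running it with the same arguments as the test (for example
java -Xms256m -Xmx256m HeapSettingsCheck) should show both figures close
to 256.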


In this situation, I don't believe it's worth giving users the ability
to choose which parser they want. I won't remove either one now, but I
believe HtmlParser is the best choice, once we have cleaned up the
outstanding bugs.


The bugs I mentioned before (failure to parse a
couple of image URLs) still hold. I'll file them now.


--
Salut,

Jordi.

Jordi Salvat i Alabart wrote:

Hi.

I've finally found some time to test the performance of the
HTTPSamplerFull implementation currently in CVS (developed by Peter Lin
using HTMLParser) against the implementation I sent a while ago to the
list (developed by me using Regexps). [Remember: the objective is not
to decide which is best, but whether it's worth having both available to
script developers].

The results are not conclusive, but they prove that the issue deserves
further analysis:

1/ On the example I've been using, the Regexp-based implementation was
more accurate than the HTMLParser-based one. This is very surprising to
me, since I expected the Regexp-based implementation to be generally
less accurate. I'll need some help on this one. More details later.


2/ On the example I've been using, the Regexp-based implementation was
at least 7 times faster than the HTMLParser-based one. A quick look at
the code suggests that the HTML Parser is being called 5 times (once for
each tag of interest: img, applet, input, body, table). Am I correct?
The regexp-based implementation only scans through the HTML once. This
could well explain most of the performance difference. Is there any way
to recode the HTMLParser-based implementation to do the job in a single
scan?
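
The single-scan idea in point 2/ can be sketched with java.util.regex as
follows. This is only an illustration, not the code of either
HTTPSamplerFull implementation: the class name, the pattern, and the
tag-to-attribute mapping (src for img and input, background for body and
table, code for applet) are assumptions. A regexp like this can also be
fooled by comments, scripts, or attribute names such as data-src.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects embedded-resource references for all
// tags of interest in ONE pass over the HTML.
class SinglePassResourceScanner {

    // A single alternation covers every tag/attribute pair (assumed
    // mapping), so the matcher walks the input once instead of once
    // per tag. Groups 1-3 hold the attribute value: double-quoted,
    // single-quoted, or unquoted.
    private static final Pattern RESOURCE = Pattern.compile(
        "<(?:(?:img|input)\\b[^>]*?\\bsrc"
        + "|(?:body|table)\\b[^>]*?\\bbackground"
        + "|applet\\b[^>]*?\\bcode)"
        + "\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))",
        Pattern.CASE_INSENSITIVE);

    // Returns the raw attribute values in document order.
    static List<String> findResourceUrls(String html) {
        List<String> urls = new ArrayList<>();
        Matcher m = RESOURCE.matcher(html);
        while (m.find()) {
            // Exactly one of the three value groups is set per match.
            for (int g = 1; g <= 3; g++) {
                if (m.group(g) != null) {
                    urls.add(m.group(g));
                    break;
                }
            }
        }
        return urls;
    }
}
```

A caller would pass the page body to findResourceUrls and resolve each
returned value against the page URL before downloading it.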

How to reproduce the test:
- Get Apache and JMeter running (I'm running both on the same box, which
  is probably a bad idea).
- Uncompress the attached test-httpsamplerfull.tgz in the Apache
  docroot. It contains a Yahoo home page saved using Mozilla 1.5. (A
  proper test would use several other samples).
- Run the attached script and look at the Rate in the Aggregate Report.

On my IBM T30 with Pentium 4 M @ 2.2 GHz, 1 GB RAM, with JDK 1.4.2_02,
no fiddling with the Java arguments (yes, that means I'm using -Xincgc,
which is probably the worst possible choice) I'm getting around 1
sample/second with the HTMLParser-based sampler and around 7
samples/second with the Regexp-based one.

In addition, the HTMLParser-based implementation is failing to download
two images: powrdbyhp_blu_84x28_yahoo.gif (it is downloading the HTML
page again instead) and 031121_l300.gif (it downloads nothing). I've
used Mozilla's "Live HTTP Headers" to see what Mozilla does and it
matches what the Regexp-based implementation is doing. I'd say there's a
bug in the HTMLParser. Can someone familiar with it have a look? (Hi
Peter!).




------------------------------------------------------------------------

<?xml version="1.0" encoding="UTF-8"?>
<node>
<testelement class="org.apache.jmeter.testelement.TestPlan">
  <testelement class="org.apache.jmeter.config.Arguments" name="TestPlan.user_defined_variables">
    <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.gui_class">org.apache.jmeter.config.gui.ArgumentsPanel</property>
    <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.test_class">org.apache.jmeter.config.Arguments</property>
    <collection class="java.util.ArrayList" propType="org.apache.jmeter.testelement.property.CollectionProperty" name="Arguments.arguments"/>
    <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.name">Argument List</property>
    <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.BooleanProperty" name="TestElement.enabled">true</property>
  </testelement>
  <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.gui_class">org.apache.jmeter.control.gui.TestPlanGui</property>
  <collection class="java.util.LinkedList" propType="org.apache.jmeter.testelement.property.CollectionProperty" name="TestPlan.thread_groups"/>
  <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.test_class">org.apache.jmeter.testelement.TestPlan</property>
  <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.BooleanProperty" name="TestPlan.serialize_threadgroups">false</property>
  <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.name">Test Plan</property>
  <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.BooleanProperty" name="TestElement.enabled">true</property>
  <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.BooleanProperty" name="TestPlan.functional_mode">false</property>
</testelement>
<node>
<testelement class="org.apache.jmeter.threads.ThreadGroup">
  <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.gui_class">org.apache.jmeter.threads.gui.ThreadGroupGui</property>
  <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.LongProperty" name="ThreadGroup.start_time">0</property>
  <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.test_class">org.apache.jmeter.threads.ThreadGroup</property>
  <testelement class="org.apache.jmeter.control.LoopController" name="ThreadGroup.main_controller">
    <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.gui_class">org.apache.jmeter.control.gui.LoopControlPanel</property>
    <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.IntegerProperty" name="LoopController.loops">-1</property>
    <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.test_class">org.apache.jmeter.control.LoopController</property>
    <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.name">Loop Controller</property>
    <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.BooleanProperty" name="TestElement.enabled">true</property>
    <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.BooleanProperty" name="LoopController.continue_forever">false</property>
  </testelement>

=== message truncated ===



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]






