I'm suspecting your guess is wrong... :-)
-- Cheers,
Jordi.
peter lin wrote:
can you verify if the old JTidy implementation contains the same bug?
I'm going to guess it's how I'm using htmlparser.
peter
--- Jordi Salvat i Alabart <[EMAIL PROTECTED]> wrote:
Responding to myself again...
I've been running some more tests with JVM arguments
that I believe are more sensible, namely:
-Xms256m -Xmx256m -XX:NewSize=64m -XX:MaxNewSize=64m
-XX:MaxLiveObjectEvacuationRatio=40 -XX:SurvivorRatio=8
With this, the performance difference has almost disappeared: I'm
getting ca. 12 samples/second with the htmlparser and 15 samples/second
with the regexp approach. The htmlparser solution generates about 5
times more garbage than the regexp solution, which explains why the
results were so tremendously different using -Xincgc.
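For anyone who wants to compare how much collector work two code paths
cause, here is a minimal sketch in Java. It is not how the figures above
were obtained: it relies on java.lang.management, which only arrived in
Java 5 and so is not available on JDK 1.4.2. The class name GcProbe and
the stand-in workload are made up for illustration, and the numbers
include whatever other threads allocate in the meantime.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcProbe {
    /** Sums collection counts over all collectors, skipping undefined (-1) values. */
    static long collectionCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Sums accumulated collection time in milliseconds, skipping undefined (-1) values. */
    static long collectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Runs the workload and returns {collections, gcMillis} accrued while it ran. */
    static long[] measure(Runnable workload) {
        long countBefore = collectionCount();
        long millisBefore = collectionMillis();
        workload.run();
        return new long[] {
            collectionCount() - countBefore,
            collectionMillis() - millisBefore
        };
    }

    public static void main(String[] args) {
        long[] result = measure(() -> {
            // Hypothetical workload: allocate many short-lived strings.
            long sink = 0;
            for (int i = 0; i < 2_000_000; i++) {
                sink += Integer.toString(i).length();
            }
            System.out.println("sink=" + sink);
        });
        System.out.println("collections=" + result[0] + " gcMillis=" + result[1]);
    }
}
```

Wrapping each parser's extraction call in measure(...) and running both
under the same heap settings would give a rough side-by-side view; the
verbose GC log remains the more detailed source.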
In this situation, I don't believe it's worth providing users with the
ability to choose which parser they want. I won't remove either one
now, but I believe HtmlParser is the best choice... once we have
managed to fix the outstanding bugs.
The bugs I mentioned before (failure to parse a
couple of image URLs) still hold. I'll file them now.
-- Cheers,
Jordi.
Jordi Salvat i Alabart wrote:
Hi.
I've finally found some time to test the performance of the
HTTPSamplerFull implementation currently in CVS (developed by Peter Lin
using HTMLParser) against the implementation I sent a while ago to the
list (developed by me using Regexps). [Remember: the objective is not
to decide which is best, but whether it's worth having both available to
script developers].
The results are not conclusive, but they show that the issue deserves
further analysis:
1/ On the example I've been using, the Regexp-based implementation was
more accurate than the HTMLParser-based one. This is very surprising to
me, since I expected the Regexp-based implementation to be generally
less accurate. I'll need some help on this one. More details later.
2/ On the example I've been using, the Regexp-based implementation was
at least 7 times faster than the HTMLParser-based one. A quick look at
the code suggests that the HTML Parser is being called 5 times (once for
each tag of interest: img, applet, input, body, table). Am I correct?
The regexp-based implementation only scans through the HTML once. This
could well explain most of the performance difference. Is there any way
to recode the HTMLParser-based implementation to do the job in a single
scan?
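
To illustrate the single-scan idea, here is a minimal sketch in Java. It
is not the code of either implementation discussed here: the class name
SingleScanResourceFinder, the method findResourceUrls and the choice of
attributes (src, code, background) are assumptions made for
illustration. It also cuts corners a real sampler could not, for
example it would be fooled by an attribute such as data-src and it does
not check the type of input elements.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SingleScanResourceFinder {
    // One alternation covers every tag/attribute pair of interest, so the
    // matcher walks the document once instead of once per tag.
    private static final Pattern RESOURCE = Pattern.compile(
        "<(?:"
        + "img\\b[^>]*?\\bsrc"
        + "|applet\\b[^>]*?\\bcode"
        + "|input\\b[^>]*?\\bsrc"
        + "|(?:body|table)\\b[^>]*?\\bbackground"
        + ")\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))",
        Pattern.CASE_INSENSITIVE);

    /** Returns the attribute values found, in document order. */
    static List<String> findResourceUrls(String html) {
        List<String> urls = new ArrayList<>();
        Matcher m = RESOURCE.matcher(html);
        while (m.find()) {
            // Exactly one of the three groups matches, depending on whether
            // the value was double-quoted, single-quoted or unquoted.
            for (int g = 1; g <= 3; g++) {
                if (m.group(g) != null) {
                    urls.add(m.group(g));
                    break;
                }
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        String html = "<body background=\"bg.gif\"><img src='a.gif'>"
            + "<table background=t.gif></table></body>";
        for (String url : findResourceUrls(html)) {
            System.out.println(url);
        }
    }
}
```

The values returned are the raw attribute strings; resolving them
against the page URL would be a separate step.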
How to reproduce the test:
- Get Apache and JMeter running (I'm running both on the same box, which
  is probably a bad idea).
- Uncompress the attached test-httpsamplerfull.tgz in the Apache
  docroot. It contains a Yahoo home page saved using Mozilla 1.5. (A
  proper test would use several other samples).
- Run the attached script and look at the Rate in the Aggregate Report.
On my IBM T30 with Pentium 4 M @ 2.2 GHz, 1 GB RAM, with JDK 1.4.2_02,
no fiddling with the java arguments (yes, that means I'm using -Xincgc,
which is probably the worst possible choice) I'm getting around 1
sample/second with the HTMLParser-based sampler and around 7
samples/second with the Regexp-based one.
In addition, the HTMLParser-based implementation is failing to download
two images: powrdbyhp_blu_84x28_yahoo.gif (it is downloading the HTML
page again instead) and 031121_l300.gif (it downloads nothing). I've
used Mozilla's "Live HTTP Headers" to see what Mozilla does and it
matches what the Regexp-based implementation is doing. I'd say there's a
bug in the HTMLParser. Can someone familiar with it have a look? (Hi
Peter!).
<?xml version="1.0" encoding="UTF-8"?>
<node>
<testelement class="org.apache.jmeter.testelement.TestPlan">
  <testelement class="org.apache.jmeter.config.Arguments" name="TestPlan.user_defined_variables">
    <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.gui_class">org.apache.jmeter.config.gui.ArgumentsPanel</property>
    <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.test_class">org.apache.jmeter.config.Arguments</property>
    <collection class="java.util.ArrayList" propType="org.apache.jmeter.testelement.property.CollectionProperty" name="Arguments.arguments"/>
    <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.name">Argument List</property>
    <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.BooleanProperty" name="TestElement.enabled">true</property>
  </testelement>
  <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.gui_class">org.apache.jmeter.control.gui.TestPlanGui</property>
  <collection class="java.util.LinkedList" propType="org.apache.jmeter.testelement.property.CollectionProperty" name="TestPlan.thread_groups"/>
  <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.test_class">org.apache.jmeter.testelement.TestPlan</property>
  <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.BooleanProperty" name="TestPlan.serialize_threadgroups">false</property>
  <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.name">Test Plan</property>
  <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.BooleanProperty" name="TestElement.enabled">true</property>
  <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.BooleanProperty" name="TestPlan.functional_mode">false</property>
</testelement>
<node>
<testelement class="org.apache.jmeter.threads.ThreadGroup">
  <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.gui_class">org.apache.jmeter.threads.gui.ThreadGroupGui</property>
  <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.LongProperty" name="ThreadGroup.start_time">0</property>
  <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.test_class">org.apache.jmeter.threads.ThreadGroup</property>
  <testelement class="org.apache.jmeter.control.LoopController" name="ThreadGroup.main_controller">
    <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.gui_class">org.apache.jmeter.control.gui.LoopControlPanel</property>
    <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.IntegerProperty" name="LoopController.loops">-1</property>
    <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.test_class">org.apache.jmeter.control.LoopController</property>
    <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.name">Loop Controller</property>
    <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.BooleanProperty" name="TestElement.enabled">true</property>
    <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.BooleanProperty" name="LoopController.continue_forever">false</property>
  </testelement>
=== message truncated ===
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
