Can you verify whether the old JTidy implementation
contains the same bug?

I'm going to guess it's how I'm using htmlparser.

peter


--- Jordi Salvat i Alabart <[EMAIL PROTECTED]> wrote:
> Responding to myself again...
> 
> I've been running some more tests with JVM arguments that I believe are
> more sensible, namely:
> 
> -Xms256m -Xmx256m -XX:NewSize=64m -XX:MaxNewSize=64m
> -XX:MaxLiveObjectEvacuationRatio=40 -XX:SurvivorRatio=8
> 
> With this, the performance difference has almost disappeared: I'm
> getting ca. 12 samples/second with the htmlparser and 15 samples/second
> with the regexp approach. The htmlparser solution generates about 5 times
> more garbage than the regexp solution, which explains why the results
> were so tremendously different using -Xincgc.
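
One way to compare how much collector work each sampler triggers is to read the JVM's own GC counters before and after a run. The following is only a minimal sketch: `GcActivity` and `measure` are hypothetical names, not JMeter code, and it assumes Java 5 or later for `java.lang.management` (the JDK 1.4.2 used in this thread does not have it).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper for illustration; not part of JMeter.
class GcActivity {

    // Total collections reported by all collectors since JVM start.
    // A collector reports -1 when the count is unavailable; those are skipped.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds.
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // Runs the workload and returns {collections, gcMillis} observed while it ran.
    // The counters are JVM-wide, so activity from other threads is included.
    static long[] measure(Runnable workload) {
        long collectionsBefore = totalCollections();
        long millisBefore = totalCollectionMillis();
        workload.run();
        return new long[] {
            totalCollections() - collectionsBefore,
            totalCollectionMillis() - millisBefore
        };
    }
}
```

Wrapping the same parsing workload for each implementation in `GcActivity.measure(...)` gives a rough side-by-side view. It counts collections rather than bytes allocated, so it is an indirect indicator at best.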
> 
> In this situation, I don't believe it's worth providing users with the
> ability to choose which parser they want. I won't remove them now, but I
> believe HtmlParser is the best choice, once we have managed to
> clean up the outstanding bugs.
> 
> The bugs I mentioned before (failure to parse a couple of image URLs)
> still hold. I'll file them now.
> 
> -- 
> Salut,
> 
> Jordi.
> 
> Jordi Salvat i Alabart wrote:
> > Hi.
> > 
> > I've finally found some time to test the performance of the
> > HTTPSamplerFull implementation currently in CVS (developed by Peter Lin
> > using HTMLParser) against the implementation I sent a while ago to the
> > list (developed by me using Regexps). [Remember: the objective is not
> > to decide which is best, but whether it's worth having both available to
> > script developers].
> > 
> > The results are not conclusive, but they prove that the issue deserves
> > further analysis:
> > 
> > 1/ On the example I've been using, the Regexp-based implementation was
> > more accurate than the HTMLParser-based one. This is very surprising to
> > me, since I expected the Regexp-based implementation to be generally
> > less accurate. I'll need some help on this one. More details later.
> > 
> > 2/ On the example I've been using, the Regexp-based implementation was
> > at least 7 times faster than the HTMLParser-based one. A quick look at
> > the code suggests that the HTML Parser is being called 5 times (once for
> > each tag of interest: img, applet, input, body, table). Am I correct?
> > The regexp-based implementation only scans through the HTML once. This
> > could well explain most of the performance difference. Is there any way
> > to recode the HTMLParser-based implementation to do the job in a single
> > scan?
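
To make the single-scan idea concrete, here is a minimal sketch of one regular expression that picks up resource URLs for all five tags in one pass. It is not the regexp implementation that was posted to the list: the class name `SinglePassResourceScanner` is hypothetical, and the choice of attributes (`src` for img and input, `code` for applet, `background` for body and table) is an assumption. It also assumes a current Java version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not JMeter's actual code.
class SinglePassResourceScanner {

    // One alternation covers every tag of interest, so the HTML is scanned once.
    // Assumed attribute per tag: img/input -> src, applet -> code,
    // body/table -> background. The value may be double-quoted,
    // single-quoted, or unquoted (groups 1, 2 and 3 respectively).
    private static final Pattern RESOURCE = Pattern.compile(
        "<(?:(?:img|input)\\b[^>]*?\\bsrc"
        + "|applet\\b[^>]*?\\bcode"
        + "|(?:body|table)\\b[^>]*?\\bbackground)"
        + "\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))",
        Pattern.CASE_INSENSITIVE);

    // Returns the attribute values in document order.
    static List<String> extract(String html) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = RESOURCE.matcher(html);
        while (matcher.find()) {
            for (int group = 1; group <= 3; group++) {
                if (matcher.group(group) != null) {
                    urls.add(matcher.group(group));
                    break;
                }
            }
        }
        return urls;
    }
}
```

A pattern like this is easily fooled, for example by a `>` inside an attribute value or by markup inside comments and scripts, which is part of why a regexp scan would normally be expected to be less accurate than a real parser.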
> > 
> > How to reproduce the test:
> > - Get Apache and JMeter running (I'm running both on the same box, which
> > is probably a bad idea).
> > - Uncompress the attached test-httpsamplerfull.tgz in the Apache
> > docroot. It contains a Yahoo home page saved using Mozilla 1.5. (A
> > proper test would use several other samples).
> > - Run the attached script and look at the Rate in the Aggregate Report.
> > 
> > On my IBM T30 with Pentium 4 M @ 2.2 GHz, 1 GB RAM, with JDK 1.4.2_02,
> > no fiddling with the java arguments (yes, that means I'm using -Xincgc,
> > which is probably the worst possible choice) I'm getting around 1
> > sample/second with the HTMLParser-based sampler and around 7
> > samples/second with the Regexp-based one.
> > 
> > In addition, the HTMLParser-based implementation is failing to download
> > two images: powrdbyhp_blu_84x28_yahoo.gif (it is downloading the HTML
> > page again instead) and 031121_l300.gif (it downloads nothing). I've
> > used Mozilla's "Live HTTP Headers" to see what Mozilla does and it
> > matches what the Regexp-based implementation is doing. I'd say there's a
> > bug in the HTMLParser. Can someone familiar with it have a look? (Hi
> > Peter!).
> > 
> > 
> > ------------------------------------------------------------------------
> > 
> > <?xml version="1.0" encoding="UTF-8"?>
> > <node>
> > <testelement class="org.apache.jmeter.testelement.TestPlan">
> > <testelement class="org.apache.jmeter.config.Arguments" name="TestPlan.user_defined_variables">
> > <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.gui_class">org.apache.jmeter.config.gui.ArgumentsPanel</property>
> > <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.test_class">org.apache.jmeter.config.Arguments</property>
> > <collection class="java.util.ArrayList" propType="org.apache.jmeter.testelement.property.CollectionProperty" name="Arguments.arguments"/>
> > <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.name">Argument List</property>
> > <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.BooleanProperty" name="TestElement.enabled">true</property>
> > </testelement>
> > <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.gui_class">org.apache.jmeter.control.gui.TestPlanGui</property>
> > <collection class="java.util.LinkedList" propType="org.apache.jmeter.testelement.property.CollectionProperty" name="TestPlan.thread_groups"/>
> > <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.test_class">org.apache.jmeter.testelement.TestPlan</property>
> > <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.BooleanProperty" name="TestPlan.serialize_threadgroups">false</property>
> > <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.name">Test Plan</property>
> > <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.BooleanProperty" name="TestElement.enabled">true</property>
> > <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.BooleanProperty" name="TestPlan.functional_mode">false</property>
> > </testelement>
> > <node>
> > <testelement class="org.apache.jmeter.threads.ThreadGroup">
> > <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.gui_class">org.apache.jmeter.threads.gui.ThreadGroupGui</property>
> > <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.LongProperty" name="ThreadGroup.start_time">0</property>
> > <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.test_class">org.apache.jmeter.threads.ThreadGroup</property>
> > <testelement class="org.apache.jmeter.control.LoopController" name="ThreadGroup.main_controller">
> > <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.gui_class">org.apache.jmeter.control.gui.LoopControlPanel</property>
> > <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.IntegerProperty" name="LoopController.loops">-1</property>
> > <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.test_class">org.apache.jmeter.control.LoopController</property>
> > <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.StringProperty" name="TestElement.name">Loop Controller</property>
> > <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.BooleanProperty" name="TestElement.enabled">true</property>
> > <property xml:space="preserve" propType="org.apache.jmeter.testelement.property.BooleanProperty" name="LoopController.continue_forever">false</property>
> > </testelement>
> 
=== message truncated ===

