Can you verify whether the old JTidy implementation contains the same bug? My guess is that the problem is in how I'm using htmlparser.
peter

--- Jordi Salvat i Alabart <[EMAIL PROTECTED]> wrote:
> Responding to myself again...
>
> I've been running some more tests with JVM arguments that I believe
> are more sensible, namely:
>
> -Xms256m -Xmx256m -XX:NewSize=64m -XX:MaxNewSize=64m
> -XX:MaxLiveObjectEvacuationRatio=40 -XX:SurvivorRatio=8
>
> With this, the performance difference has almost disappeared: I'm
> getting ca. 12 samples/second with the htmlparser, 15 samples/second
> with the regexp approach. The htmlparser solution generates about 5
> times more garbage than the regexp solution -- which explains why the
> results were so tremendously different using -Xincgc.
>
> In this situation, I don't believe it's worth providing users with the
> ability to choose which parser they want. I won't remove them now, but
> I believe HtmlParser is the best choice,... once we have managed to
> clear up the outstanding bugs.
>
> The bugs I mentioned before (failure to parse a couple of image URLs)
> still hold. I'll file them now.
>
> --
> Cheers,
>
> Jordi.
>
> Jordi Salvat i Alabart wrote:
> > Hi.
> >
> > I've finally found some time to test the performance of the
> > HTTPSamplerFull implementation currently in CVS (developed by Peter
> > Lin using HTMLParser) against the implementation I sent a while ago
> > to the list (developed by me using Regexps). [Remember: the
> > objective is not to decide which is best, but whether it's worth
> > having both available to script developers].
> >
> > The results are not conclusive, but they prove that the issue
> > deserves further analysis:
> >
> > 1/ On the example I've been using, the Regexp-based implementation
> > was more accurate than the HTMLParser-based one. This is very
> > surprising to me, since I expected the Regexp-based implementation
> > to be generally less accurate. I'll need some help on this one.
> > More details later.
> >
> > 2/ On the example I've been using, the Regexp-based implementation
> > was at least 7 times faster than the HTMLParser-based one. A quick
> > look at the code suggests that the HTML Parser is being called 5
> > times (once for each tag of interest: img, applet, input, body,
> > table). Am I correct? The regexp-based implementation only scans
> > through the HTML once. This could well explain most of the
> > performance difference. Is there any way to recode the
> > HTMLParser-based implementation to do the job in a single scan?
> >
> > How to reproduce the test:
> > - Get Apache and JMeter running (I'm running both on the same box,
> >   which is probably a bad idea).
> > - Uncompress the attached test-httpsamplerfull.tgz in the Apache
> >   docroot. It contains a Yahoo home page saved using Mozilla 1.5.
> >   (A proper test would use several other samples).
> > - Run the attached script and look at the Rate in the Aggregate
> >   Report.
> >
> > On my IBM T30 with Pentium 4 M @ 2.2 GHz, 1 GB RAM, with JDK
> > 1.4.2_02, no fiddling with the java arguments (yes, that means I'm
> > using -Xincgc, which is probably the worst possible choice) I'm
> > getting around 1 sample/second with the HTMLParser-based sampler
> > and around 7 samples/second with the Regexp-based one.
> >
> > In addition, the HTMLParser-based implementation is failing to
> > download two images: powrdbyhp_blu_84x28_yahoo.gif (it is
> > downloading the HTML page again instead) and 031121_l300.gif (it
> > downloads nothing). I've used Mozilla's "Live HTTP Headers" to see
> > what Mozilla does and it matches what the Regexp-based
> > implementation is doing. I'd say there's a bug in the HTMLParser.
> > Can someone familiar with it have a look? (Hi Peter!).
> > > > > > > ------------------------------------------------------------------------ > > > > <?xml version="1.0" encoding="UTF-8"?> > > <node> > > <testelement > class="org.apache.jmeter.testelement.TestPlan"> > > <testelement > class="org.apache.jmeter.config.Arguments" > name="TestPlan.user_defined_variables"> > > <property xml:space="preserve" > propType="org.apache.jmeter.testelement.property.StringProperty" > name="TestElement.gui_class">org.apache.jmeter.config.gui.ArgumentsPanel</property> > > <property xml:space="preserve" > propType="org.apache.jmeter.testelement.property.StringProperty" > name="TestElement.test_class">org.apache.jmeter.config.Arguments</property> > > <collection class="java.util.ArrayList" > propType="org.apache.jmeter.testelement.property.CollectionProperty" > name="Arguments.arguments"/> > > <property xml:space="preserve" > propType="org.apache.jmeter.testelement.property.StringProperty" > name="TestElement.name">Argument List</property> > > <property xml:space="preserve" > propType="org.apache.jmeter.testelement.property.BooleanProperty" > name="TestElement.enabled">true</property> > > </testelement> > > <property xml:space="preserve" > propType="org.apache.jmeter.testelement.property.StringProperty" > name="TestElement.gui_class">org.apache.jmeter.control.gui.TestPlanGui</property> > > <collection class="java.util.LinkedList" > propType="org.apache.jmeter.testelement.property.CollectionProperty" > name="TestPlan.thread_groups"/> > > <property xml:space="preserve" > propType="org.apache.jmeter.testelement.property.StringProperty" > name="TestElement.test_class">org.apache.jmeter.testelement.TestPlan</property> > > <property xml:space="preserve" > propType="org.apache.jmeter.testelement.property.BooleanProperty" > name="TestPlan.serialize_threadgroups">false</property> > > <property xml:space="preserve" > propType="org.apache.jmeter.testelement.property.StringProperty" > name="TestElement.name">Test Plan</property> > > <property 
xml:space="preserve" > propType="org.apache.jmeter.testelement.property.BooleanProperty" > name="TestElement.enabled">true</property> > > <property xml:space="preserve" > propType="org.apache.jmeter.testelement.property.BooleanProperty" > name="TestPlan.functional_mode">false</property> > > </testelement> > > <node> > > <testelement > class="org.apache.jmeter.threads.ThreadGroup"> > > <property xml:space="preserve" > propType="org.apache.jmeter.testelement.property.StringProperty" > name="TestElement.gui_class">org.apache.jmeter.threads.gui.ThreadGroupGui</property> > > <property xml:space="preserve" > propType="org.apache.jmeter.testelement.property.LongProperty" > name="ThreadGroup.start_time">0</property> > > <property xml:space="preserve" > propType="org.apache.jmeter.testelement.property.StringProperty" > name="TestElement.test_class">org.apache.jmeter.threads.ThreadGroup</property> > > <testelement > class="org.apache.jmeter.control.LoopController" > name="ThreadGroup.main_controller"> > > <property xml:space="preserve" > propType="org.apache.jmeter.testelement.property.StringProperty" > name="TestElement.gui_class">org.apache.jmeter.control.gui.LoopControlPanel</property> > > <property xml:space="preserve" > propType="org.apache.jmeter.testelement.property.IntegerProperty" > name="LoopController.loops">-1</property> > > <property xml:space="preserve" > propType="org.apache.jmeter.testelement.property.StringProperty" > name="TestElement.test_class">org.apache.jmeter.control.LoopController</property> > > <property xml:space="preserve" > propType="org.apache.jmeter.testelement.property.StringProperty" > name="TestElement.name">Loop Controller</property> > > <property xml:space="preserve" > propType="org.apache.jmeter.testelement.property.BooleanProperty" > name="TestElement.enabled">true</property> > > <property xml:space="preserve" > propType="org.apache.jmeter.testelement.property.BooleanProperty" > name="LoopController.continue_forever">false</property> > > 
</testelement> > === message truncated ===
