Andres, the tactic of increasing grep plugin performance by changing the parsing engine only works for plugins that request parsed (not plain) data by calling getDocumentParserFor(), doesn't it?
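To make the distinction concrete, here is a minimal sketch of the two plugin styles. Only the dpCache.dpc.getDocumentParserFor() call is taken from the four plugins below; the import path, the getBody()/getComments() methods and the function names are my assumptions:

    import re
    import core.data.parsers.dpCache as dpCache

    # Style 1: plain-text grepping -- a regex over the raw body.
    # A faster parsing engine does not help here at all.
    def grep_plain(response):
        return re.findall('<!--(.*?)-->', response.getBody())

    # Style 2: parsed-data grepping -- asks the cache for a document
    # parser, so it benefits directly from the libxml2 switch.
    def grep_parsed(response):
        dp = dpCache.dpc.getDocumentParserFor(response)
        return dp.getComments()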
But at the same time we have only 4 such plugins:

    $ grep -R getDocumentParserFor *.py
    findComments.py:      dp = dpCache.dpc.getDocumentParserFor( response )
    getMails.py:          dp = dpCache.dpc.getDocumentParserFor( response )
    metaTags.py:          dp = dpCache.dpc.getDocumentParserFor( response )
    strangeParameters.py: dp = dpCache.dpc.getDocumentParserFor( response )

So are 4 plugins enough to justify changing the parsing engine?

> I've been working on the performance of the grep plugins. I
> basically found that some of them used regular expressions heavily,
> and those regular expressions were far from being fast. After some
> hours of trying to enhance the performance of each particular regex,
> I decided to move on and change the tactic. I tried the following:
>
> 1- Load the HTML into xml.minidom
> 2- Load the HTML into BeautifulSoup
> 3- Load the HTML into libxml2
>
> The first one was fast, but... it failed to parse broken HTML. The
> second one was GREAT at handling broken HTML, but made my tests run
> in DOUBLE the time! Finally, libxml2 gave us a good balance between
> speed and broken HTML handling. With #3 I reduced my test time from
> 10sec to 4sec. The attached file shows the functions that consume
> the most CPU time. Tomorrow I'll be working on enhancing the grep
> plugins even more. If you want to help, please join the #w3af IRC
> channel, and we'll work together!

--
Taras
http://oxdef.info
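P.S. For anyone who wants to see the broken-HTML handling Andres describes, libxml2's HTML parser has a recovery mode that still builds a tree from invalid markup. A quick standalone illustration; the sample HTML and the option set are my own, not taken from the w3af code:

    import libxml2

    # Deliberately broken HTML: unclosed <b> and <p> tags.
    broken = '<html><body><p>hello <b>world<p>bye</body>'

    # HTML_PARSE_RECOVER builds a tree even for invalid input;
    # NOERROR/NOWARNING silence the parser's console output.
    opts = (libxml2.HTML_PARSE_RECOVER |
            libxml2.HTML_PARSE_NOERROR |
            libxml2.HTML_PARSE_NOWARNING)
    doc = libxml2.htmlReadDoc(broken, None, 'utf-8', opts)

    # The tree is usable despite the broken input.
    for node in doc.xpathEval('//p'):
        print node.getContent()

    doc.freeDoc()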