Andres, the tactic of increasing grep plugin performance by changing the parsing engine only works for plugins that request parsed (not plain) data by calling getDocumentParserFor(), doesn't it?
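To make the distinction concrete, here is a minimal sketch of the two plugin styles. Only the dpCache.dpc.getDocumentParserFor() call is taken from the four plugins below; the import path, the getBody()/getComments() methods and the function names are my assumptions:

    import re
    import core.data.parsers.dpCache as dpCache

    # Style 1: plain-text grepping -- a regex over the raw body.
    # A faster parsing engine does not help here at all.
    def grep_plain(response):
        return re.findall('<!--(.*?)-->', response.getBody())

    # Style 2: parsed-data grepping -- asks the cache for a document
    # parser, so it benefits directly from the libxml2 switch.
    def grep_parsed(response):
        dp = dpCache.dpc.getDocumentParserFor(response)
        return dp.getComments()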
But at the same time we have only 4 such plugins:

    $ grep -R getDocumentParserFor *.py
    findComments.py:      dp = dpCache.dpc.getDocumentParserFor( response )
    getMails.py:          dp = dpCache.dpc.getDocumentParserFor( response )
    metaTags.py:          dp = dpCache.dpc.getDocumentParserFor( response )
    strangeParameters.py: dp = dpCache.dpc.getDocumentParserFor( response )

So are 4 plugins enough to justify changing the parsing engine?

> I've been working on the performance of the grep plugins. I
> basically found that some of them used regular expressions heavily,
> and those regular expressions were far from being fast. After some
> hours of trying to enhance the performance of each particular regex,
> I decided to move on and change the tactic. I tried the following:
>
> 1- Load the HTML into xml.minidom
> 2- Load the HTML into BeautifulSoup
> 3- Load the HTML into libxml2
>
> The first one was fast, but... it failed to parse broken HTML. The
> second one was GREAT at handling broken HTML, but made my tests run
> in DOUBLE the time! Finally, libxml2 gave us a good balance between
> speed and broken HTML handling. With #3 I reduced my test time from
> 10sec to 4sec. The attached file shows the functions that consume
> the most CPU time. Tomorrow I'll be working on enhancing the grep
> plugins even more. If you want to help, please join the #w3af IRC
> channel, and we'll work together!

--
Taras
http://oxdef.info
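P.S. For anyone who wants to see the broken-HTML handling Andres describes, libxml2's HTML parser has a recovery mode that still builds a tree from invalid markup. A quick standalone illustration; the sample HTML and the option set are my own, not taken from the w3af code:

    import libxml2

    # Deliberately broken HTML: unclosed <b> and <p> tags.
    broken = '<html><body><p>hello <b>world<p>bye</body>'

    # HTML_PARSE_RECOVER builds a tree even for invalid input;
    # NOERROR/NOWARNING silence the parser's console output.
    opts = (libxml2.HTML_PARSE_RECOVER |
            libxml2.HTML_PARSE_NOERROR |
            libxml2.HTML_PARSE_NOWARNING)
    doc = libxml2.htmlReadDoc(broken, None, 'utf-8', opts)

    # The tree is usable despite the broken input.
    for node in doc.xpathEval('//p'):
        print node.getContent()

    doc.freeDoc()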