Adi,
Thanks for reporting this bug. It is caused by the way libxml2 parses
HTML responses (which is not perfect), and we were not handling the
resulting exception. I've added better exception handling inside our
parser, which will let you run the scan without hitting this issue.
The only drawback of this fix is that libxml2 will still fail to parse
that HTTP response body, so w3af might miss some links in the
application.
The fix is in revision 4511 from our SVN.
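
Roughly, the idea looks like the sketch below. This is a simplified
illustration, not the actual code from revision 4511: it uses Python's
standard xml.etree.ElementTree as a stand-in for lxml so that it runs
without extra dependencies, and extract_links is a hypothetical name.
With lxml, the exception to catch would be the XMLSyntaxError shown in
your traceback.

```python
import xml.etree.ElementTree as ET


def extract_links(body):
    """Return href/src attribute values found in body.

    Hypothetical helper: if the body cannot be parsed, report the
    problem and return an empty list instead of letting the exception
    propagate and abort the scan.
    """
    try:
        root = ET.fromstring(body)
    except ET.ParseError as exc:
        # Stand-in for lxml's XMLSyntaxError: the parse failed, so we
        # cannot extract anything from this response.
        print("Could not parse response body: %s" % exc)
        return []

    links = []
    for element in root.iter():
        for attr in ("href", "src"):
            value = element.get(attr)
            if value:
                links.append(value)
    return links
```

Note the trade-off: a body that fails to parse yields an empty list, so
any links in that response go unnoticed, which is the limitation I
mentioned above.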
Regards,
On Sat, Dec 3, 2011 at 11:55 AM, Adi Mutu <[email protected]> wrote:
> Hello,
>
> this is what i get after choosing profile, selecting target and start:
>
>
> w3af>>> start
> Exiting setOutputPlugins()
> Called w3afCore.start()
> Called buildOpeners
> keepalive: added one connection, len(self._hostmap["xxxxxxxxxxxx.com"]): 1
> DNS response from DNS server for domain: xxxxxxxxxxxx.com
> GET http://xxxxxxxxxxxx.com/ returned HTTP code "200" - id: 1
> Starting "httpAuthDetect" grep_worker for response: <httpResponse | 200 |
> http://xxxxxxxxxxxx.com/ | id:1>
> Error in grep plugin, "httpAuthDetect" raised the exception: Element script
> embeds close tag, line 241, column 60. Please report this bug to the w3af
> sourceforge project page [ https://sourceforge.net/apps/trac/w3af/newticket
> ]
> Exception: Traceback (most recent call last):
> File "/opt/. /w3af/core/data/url/xUrllib.py", line 840, in _grep_worker
> timedout_grep_wrapper(request, response)
> XMLSyntaxError: Element script embeds close tag, line 241, column 60
>
> Traceback (most recent call last):
> File "/opt/. /w3af/core/controllers/misc/timeout_function.py", line 76,
> in run
> self._result_ = function(*args, **kwds)
> File "/opt/. /w3af/core/controllers/basePlugin/baseGrepPlugin.py", line
> 61, in grep_wrapper
> self.grep(fuzzableRequest, response)
> File "/opt/. /w3af/plugins/grep/httpAuthDetect.py", line 151, in grep
> self._find_auth_uri(response)
> File "/opt/. /w3af/plugins/grep/httpAuthDetect.py", line 186, in
> _find_auth_uri
> documentParser = dpCache.dpc.getDocumentParserFor(response)
> File "/opt/. /w3af/core/data/parsers/dpCache.py", line 69, in
> getDocumentParserFor
> res = documentParser.documentParser(httpResponse)
> File "/opt/. /w3af/core/data/parsers/documentParser.py", line 54, in
> __init__
> parser = htmlParser.HTMLParser(httpResponse)
> File "/opt/. /w3af/core/data/parsers/htmlParser.py", line 51, in
> __init__
> SGMLParser.__init__(self, http_resp)
> File "/opt/. /w3af/core/data/parsers/sgmlParser.py", line 73, in
> __init__
> self._parse(http_resp)
> File "/opt/. /w3af/core/data/parsers/sgmlParser.py", line 131, in _parse
> etree.fromstring(resp_body, parser)
> File "lxml.etree.pyx", line 2377, in lxml.etree.fromstring
> (src/lxml/lxml.etree.c:21156)
> File "parser.pxi", line 1354, in lxml.etree._parseMemoryDocument
> (src/lxml/lxml.etree.c:53514)
> File "parser.pxi", line 1239, in lxml.etree._parseDoc
> (src/lxml/lxml.etree.c:52487)
> File "parser.pxi", line 759, in lxml.etree._BaseParser._parseUnicodeDoc
> (src/lxml/lxml.etree.c:49608)
> File "parsertarget.pxi", line 130, in
> lxml.etree._TargetParserContext._handleParseResultDoc
> (src/lxml/lxml.etree.c:58561)
> File "parser.pxi", line 478, in lxml.etree._raiseParseError
> (src/lxml/lxml.etree.c:47285)
> XMLSyntaxError: Element script embeds close tag, line 241, column 60
>
> Finished grep_worker for response: <httpResponse | 200 |
> http://xxxxxxxxxxxx.com/ | id:1>
> Starting "error500" grep_worker for response: <httpResponse | 200 |
> http://xxxxxxxxxxxx.com/ | id:1>
> Finished grep_worker for response: <httpResponse | 200 |
> http://xxxxxxxxxxxx.com/ | id:1>
> The target URL: http://xxxxxxxxxxxx.com/ is unreachable because of an
> unhandled exception.
> Error description: "Element script embeds close tag, line 241, column 60".
> See debug output for more information.
> Traceback for this error: Traceback (most recent call last):
> File "/opt/. /w3af/core/controllers/w3afCore.py", line 511, in
> _realStart
> get_curr_scope_pages, createFuzzableRequests(response))
> File "/opt/. /w3af/core/data/request/frFactory.py", line 78, in
> createFuzzableRequests
> dp = dpCache.dpc.getDocumentParserFor(http_resp)
> File "/opt/. /w3af/core/data/parsers/dpCache.py", line 69, in
> getDocumentParserFor
> res = documentParser.documentParser(httpResponse)
> File "/opt/. /w3af/core/data/parsers/documentParser.py", line 54, in
> __init__
> parser = htmlParser.HTMLParser(httpResponse)
> File "/opt/. /w3af/core/data/parsers/htmlParser.py", line 51, in
> __init__
> SGMLParser.__init__(self, http_resp)
> File "/opt/. /w3af/core/data/parsers/sgmlParser.py", line 73, in
> __init__
> self._parse(http_resp)
> File "/opt/. /w3af/core/data/parsers/sgmlParser.py", line 131, in _parse
> etree.fromstring(resp_body, parser)
> File "lxml.etree.pyx", line 2377, in lxml.etree.fromstring
> (src/lxml/lxml.etree.c:21156)
> File "parser.pxi", line 1354, in lxml.etree._parseMemoryDocument
> (src/lxml/lxml.etree.c:53514)
> File "parser.pxi", line 1239, in lxml.etree._parseDoc
> (src/lxml/lxml.etree.c:52487)
> File "parser.pxi", line 759, in lxml.etree._BaseParser._parseUnicodeDoc
> (src/lxml/lxml.etree.c:49608)
> File "parsertarget.pxi", line 130, in
> lxml.etree._TargetParserContext._handleParseResultDoc
> (src/lxml/lxml.etree.c:58561)
> File "parser.pxi", line 478, in lxml.etree._raiseParseError
> (src/lxml/lxml.etree.c:47285)
> XMLSyntaxError: Element script embeds close tag, line 241, column 60
>
> Called _discoverWorker()
> Called _bruteforce()
> No URLs found by discovery.
> Cleared urllib2 local cache.
> Enabling _dnsCache()
> Calling join on all daemon threads
> Scan finished in 2 seconds.
>
>
> ------------------------------------------------------------------------------
> All the data continuously generated in your IT infrastructure
> contains a definitive record of customers, application performance,
> security threats, fraudulent activity, and more. Splunk takes this
> data and makes sense of it. IT sense. And common sense.
> http://p.sf.net/sfu/splunk-novd2d
> _______________________________________________
> W3af-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/w3af-users
>
--
Andrés Riancho
Director of Web Security at Rapid7 LLC
Founder at Bonsai Information Security
Project Leader at w3af