Adi,

    Yes, that message is fine. There is no easy way for me to fix
libxml2's way of handling parsing errors. At least now you're able to
keep scanning the target application without a crash, but that
HTML (http://xxxxxxxxx.com/index.php) will only be *partially parsed*.
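
    To illustrate what "partially parsed" means, here is a minimal
sketch. It is not the code from revision 4511. It assumes Python 3 and
uses the standard library's xml.etree.ElementTree as a stand-in for
lxml/libxml2, and the names LinkCollector and collect_links are
hypothetical. The idea is only that a target-based parser hands over
tags as it reads them, so catching the syntax error keeps whatever was
seen before the failure:

```python
import xml.etree.ElementTree as ET


class LinkCollector:
    """Parser target (hypothetical) that records href/src values as tags open."""

    def __init__(self):
        self.links = []

    def start(self, tag, attrib):
        # Called for every opening tag, before the rest of the body is read.
        for name in ("href", "src"):
            if name in attrib:
                self.links.append(attrib[name])

    def end(self, tag):
        pass

    def data(self, text):
        pass

    def close(self):
        return self.links


def collect_links(body):
    """Return (links, error); error is None when the whole body parsed."""
    target = LinkCollector()
    parser = ET.XMLParser(target=target)
    try:
        parser.feed(body)
        parser.close()
    except ET.ParseError as exc:
        # Do not abort the caller: keep what was collected before the failure.
        return target.links, exc
    return target.links, None


good = '<html><a href="/a"/><a href="/b"/></html>'
bad = '<html><a href="/a"/></oops><a href="/b"/></html>'

links_good, err_good = collect_links(good)
links_bad, err_bad = collect_links(bad)
print(links_good, err_good)
print(links_bad, err_bad)
```

In the second call the link after the broken tag is never seen, which
is the same kind of loss you should expect for that page: the scan
continues, but links past the point where the parser gives up can be
missed.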

Regards,

On Sun, Dec 4, 2011 at 3:03 PM, Adi Mutu <[email protected]> wrote:
> hah, funny, only now i saw that option in trac:)
>
> Anyway i get output similar to this:
>
> GET http://xxxxxxxxx.com/index.php?pg=5" OR "99"="99 returned HTTP code
> "200" - id: 37689
> Starting "error500" grep_worker for response: <httpResponse | 200 |
> http://xxxxxxxxx.com/index.php | id:37689>
> Finished grep_worker for response: <httpResponse | 200 |
> http://xxxxxxxxx.com/index.php | id:37689>
> Starting "httpAuthDetect" grep_worker for response: <httpResponse | 200 |
> http://xxxxxxxxx.com/index.php | id:37689>
> An error occurred while parsing "http://xxxxxxxxx.com/index.php", original
> exception: "<class 'lxml.etree.XMLSyntaxError'>"
> Finished grep_worker for response: <httpResponse | 200 |
> http://xxxxxxxxx.com/index.php | id:37689>
> keepalive: removed one connection, len(self._hostmap["xxxxxxxxx.com"]): 19
> keepalive: replacing bad connection with a new one
>
>
> Is this message OK? 'An error occurred while parsing
> "http://xxxxxxxxx.com/index.php", original exception: "<class
> 'lxml.etree.XMLSyntaxError'>"'
>
> Thanks,
>
> ________________________________
> From: Andres Riancho <[email protected]>
> To: Adi Mutu <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Sent: Sunday, December 4, 2011 5:29 PM
>
> Subject: Re: [W3af-users] w3af_console breaks at start
>
> http://sourceforge.net/apps/trac/w3af/changeset/4511 was another option :)
>
> On Sun, Dec 4, 2011 at 9:19 AM, Adi Mutu <[email protected]> wrote:
>> nevermind the last email, i've discovered svn diff -r :)
>>
>> ________________________________
>> From: Adi Mutu <[email protected]>
>> To: Andres Riancho <[email protected]>
>> Cc: "[email protected]" <[email protected]>
>> Sent: Sunday, December 4, 2011 11:46 AM
>>
>> Subject: Re: [W3af-users] w3af_console breaks at start
>>
>> Hi Andres,
>>
>> Can you tell me how I can see just the patch? I tried using your Trac
>> but failed.
>> I'm interested because I want to learn Python. I've also looked through
>> your latest WP plugin about path disclosure and understand most of it.
>>
>> Cheers,
>>
>>
>> ________________________________
>> From: Andres Riancho <[email protected]>
>> To: Adi Mutu <[email protected]>
>> Cc: "[email protected]" <[email protected]>
>> Sent: Saturday, December 3, 2011 9:02 PM
>> Subject: Re: [W3af-users] w3af_console breaks at start
>>
>> Adi,
>>
>>     Thanks for reporting this bug. The root cause is the way libxml2
>> parses HTML responses (which is not perfect), and we were not handling
>> the exception it raises. I've added better exception handling inside
>> our parser, which lets you run the scan without hitting this crash.
>> The one limitation of this fix is that libxml2 will still fail to
>> parse that HTTP response body, so w3af might miss some links in the
>> application.
>>
>>     The fix is in revision 4511 from our SVN.
>>
>> Regards,
>>
>> On Sat, Dec 3, 2011 at 11:55 AM, Adi Mutu <[email protected]> wrote:
>>> Hello,
>>>
>>> this is what i get after choosing profile, selecting target and start:
>>>
>>>
>>> w3af>>> start
>>> Exiting setOutputPlugins()
>>> Called w3afCore.start()
>>> Called buildOpeners
>>> keepalive: added one connection, len(self._hostmap["xxxxxxxxxxxx.com"]):
>>> 1
>>> DNS response from DNS server for domain: xxxxxxxxxxxx.com
>>> GET http://xxxxxxxxxxxx.com/ returned HTTP code "200" - id: 1
>>> Starting "httpAuthDetect" grep_worker for response: <httpResponse | 200 |
>>> http://xxxxxxxxxxxx.com/ | id:1>
>>> Error in grep plugin, "httpAuthDetect" raised the exception: Element
>>> script
>>> embeds close tag, line 241, column 60. Please report this bug to the w3af
>>> sourceforge project page [
>>> https://sourceforge.net/apps/trac/w3af/newticket
>>> ]
>>> Exception: Traceback (most recent call last):
>>>   File "/opt/.   /w3af/core/data/url/xUrllib.py", line 840, in
>>> _grep_worker
>>>     timedout_grep_wrapper(request, response)
>>> XMLSyntaxError: Element script embeds close tag, line 241, column 60
>>>
>>> Traceback (most recent call last):
>>>   File "/opt/.   /w3af/core/controllers/misc/timeout_function.py", line
>>> 76,
>>> in run
>>>     self._result_ = function(*args, **kwds)
>>>   File "/opt/.   /w3af/core/controllers/basePlugin/baseGrepPlugin.py",
>>> line
>>> 61, in grep_wrapper
>>>     self.grep(fuzzableRequest, response)
>>>   File "/opt/.   /w3af/plugins/grep/httpAuthDetect.py", line 151, in grep
>>>     self._find_auth_uri(response)
>>>   File "/opt/.   /w3af/plugins/grep/httpAuthDetect.py", line 186, in
>>> _find_auth_uri
>>>     documentParser = dpCache.dpc.getDocumentParserFor(response)
>>>   File "/opt/.   /w3af/core/data/parsers/dpCache.py", line 69, in
>>> getDocumentParserFor
>>>     res = documentParser.documentParser(httpResponse)
>>>   File "/opt/.   /w3af/core/data/parsers/documentParser.py", line 54, in
>>> __init__
>>>     parser = htmlParser.HTMLParser(httpResponse)
>>>   File "/opt/.   /w3af/core/data/parsers/htmlParser.py", line 51, in
>>> __init__
>>>     SGMLParser.__init__(self, http_resp)
>>>   File "/opt/.   /w3af/core/data/parsers/sgmlParser.py", line 73, in
>>> __init__
>>>     self._parse(http_resp)
>>>   File "/opt/.   /w3af/core/data/parsers/sgmlParser.py", line 131, in
>>> _parse
>>>     etree.fromstring(resp_body, parser)
>>>   File "lxml.etree.pyx", line 2377, in lxml.etree.fromstring
>>> (src/lxml/lxml.etree.c:21156)
>>>   File "parser.pxi", line 1354, in lxml.etree._parseMemoryDocument
>>> (src/lxml/lxml.etree.c:53514)
>>>   File "parser.pxi", line 1239, in lxml.etree._parseDoc
>>> (src/lxml/lxml.etree.c:52487)
>>>   File "parser.pxi", line 759, in lxml.etree._BaseParser._parseUnicodeDoc
>>> (src/lxml/lxml.etree.c:49608)
>>>   File "parsertarget.pxi", line 130, in
>>> lxml.etree._TargetParserContext._handleParseResultDoc
>>> (src/lxml/lxml.etree.c:58561)
>>>   File "parser.pxi", line 478, in lxml.etree._raiseParseError
>>> (src/lxml/lxml.etree.c:47285)
>>> XMLSyntaxError: Element script embeds close tag, line 241, column 60
>>>
>>> Finished grep_worker for response: <httpResponse | 200 |
>>> http://xxxxxxxxxxxx.com/ | id:1>
>>> Starting "error500" grep_worker for response: <httpResponse | 200 |
>>> http://xxxxxxxxxxxx.com/ | id:1>
>>> Finished grep_worker for response: <httpResponse | 200 |
>>> http://xxxxxxxxxxxx.com/ | id:1>
>>> The target URL: http://xxxxxxxxxxxx.com/ is unreachable because of an
>>> unhandled exception.
>>> Error description: "Element script embeds close tag, line 241, column
>>> 60".
>>> See debug output for more information.
>>> Traceback for this error: Traceback (most recent call last):
>>>   File "/opt/.   /w3af/core/controllers/w3afCore.py", line 511, in
>>> _realStart
>>>     get_curr_scope_pages, createFuzzableRequests(response))
>>>   File "/opt/.   /w3af/core/data/request/frFactory.py", line 78, in
>>> createFuzzableRequests
>>>     dp = dpCache.dpc.getDocumentParserFor(http_resp)
>>>   File "/opt/.   /w3af/core/data/parsers/dpCache.py", line 69, in
>>> getDocumentParserFor
>>>     res = documentParser.documentParser(httpResponse)
>>>   File "/opt/.   /w3af/core/data/parsers/documentParser.py", line 54, in
>>> __init__
>>>     parser = htmlParser.HTMLParser(httpResponse)
>>>   File "/opt/.   /w3af/core/data/parsers/htmlParser.py", line 51, in
>>> __init__
>>>     SGMLParser.__init__(self, http_resp)
>>>   File "/opt/.   /w3af/core/data/parsers/sgmlParser.py", line 73, in
>>> __init__
>>>     self._parse(http_resp)
>>>   File "/opt/.   /w3af/core/data/parsers/sgmlParser.py", line 131, in
>>> _parse
>>>     etree.fromstring(resp_body, parser)
>>>   File "lxml.etree.pyx", line 2377, in lxml.etree.fromstring
>>> (src/lxml/lxml.etree.c:21156)
>>>   File "parser.pxi", line 1354, in lxml.etree._parseMemoryDocument
>>> (src/lxml/lxml.etree.c:53514)
>>>   File "parser.pxi", line 1239, in lxml.etree._parseDoc
>>> (src/lxml/lxml.etree.c:52487)
>>>   File "parser.pxi", line 759, in lxml.etree._BaseParser._parseUnicodeDoc
>>> (src/lxml/lxml.etree.c:49608)
>>>   File "parsertarget.pxi", line 130, in
>>> lxml.etree._TargetParserContext._handleParseResultDoc
>>> (src/lxml/lxml.etree.c:58561)
>>>   File "parser.pxi", line 478, in lxml.etree._raiseParseError
>>> (src/lxml/lxml.etree.c:47285)
>>> XMLSyntaxError: Element script embeds close tag, line 241, column 60
>>>
>>> Called _discoverWorker()
>>> Called _bruteforce()
>>> No URLs found by discovery.
>>> Cleared urllib2 local cache.
>>> Enabling _dnsCache()
>>> Calling join on all daemon threads
>>> Scan finished in 2 seconds.
>>>
>>>
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> All the data continuously generated in your IT infrastructure
>>> contains a definitive record of customers, application performance,
>>> security threats, fraudulent activity, and more. Splunk takes this
>>> data and makes sense of it. IT sense. And common sense.
>>> http://p.sf.net/sfu/splunk-novd2d
>>> _______________________________________________
>>> W3af-users mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/w3af-users
>>>
>>
>>
>>
>> --
>> Andrés Riancho
>> Director of Web Security at Rapid7 LLC
>> Founder at Bonsai Information Security
>> Project Leader at w3af
>>
>>
>>
>>
>>
>>
>
>
>
> --
> Andrés Riancho
> Director of Web Security at Rapid7 LLC
> Founder at Bonsai Information Security
> Project Leader at w3af
>
>



-- 
Andrés Riancho
Director of Web Security at Rapid7 LLC
Founder at Bonsai Information Security
Project Leader at w3af
