Achim,

On Wed, May 6, 2009 at 7:08 AM, Achim Hoffmann <a...@securenet.de> wrote:
> Hi Andres,
>
> I got a buggy charset in a meta tag, which triggered the following:
>
> -----------
> [ 05/06/09 11:09:40 - debug ] GET https://some.tld/some.do?WSDL returned HTTP code "204"
> [ 05/06/09 11:09:40 - debug ] Unhandled exception in xUrllib._send(): unknown encoding: utf-81
> [ 05/06/09 11:09:40 - debug ] Traceback (most recent call last):
> [ 05/06/09 11:09:40 - debug ]   File "D:\Programme\w3af\core\data\url\xUrllib.py", line 468, in _send
> [ 05/06/09 11:09:40 - debug ]     res = self._cacheOpener.open( req )
> [ 05/06/09 11:09:40 - debug ]   File "C:\Python25\lib\urllib2.py", line 387, in open
> [ 05/06/09 11:09:40 - debug ]     response = meth(req, response)
> [ 05/06/09 11:09:40 - debug ]   File "D:\Programme\w3af\core\data\url\handlers\logHandler.py", line 159, in http_response
> [ 05/06/09 11:09:40 - debug ]     self._log_request_response( request, response )
> [ 05/06/09 11:09:40 - debug ]   File "D:\Programme\w3af\core\data\url\handlers\logHandler.py", line 154, in _log_request_response
> [ 05/06/09 11:09:40 - debug ]     res = httpResponse.httpResponse( code, body, hdrs, url, url, msg=msg, id=id)
> [ 05/06/09 11:09:40 - debug ]   File "D:\Programme\w3af\core\data\url\httpResponse.py", line 58, in __init__
> [ 05/06/09 11:09:40 - debug ]     self.setBody(read)
> [ 05/06/09 11:09:40 - debug ]   File "D:\Programme\w3af\core\data\url\httpResponse.py", line 106, in setBody
> [ 05/06/09 11:09:40 - debug ]     self._charset_handling(body)
> [ 05/06/09 11:09:40 - debug ]   File "D:\Programme\w3af\core\data\url\httpResponse.py", line 178, in _charset_handling
> [ 05/06/09 11:09:40 - debug ]     unicode_str = body.decode(charset, 'returnEscapedChar')
> [ 05/06/09 11:09:40 - debug ] LookupError: unknown encoding: utf-81
>  -----------
>
> fixed it temporarily as follows: w3af\core\data\url\httpResponse.py
>
> # Achim {
> # Achim #       unicode_str = body.decode(charset, 'returnEscapedChar')
>                 try:
>                     unicode_str = body.decode(charset, 'returnEscapedChar')
>                 except LookupError:
>                     om.out.error('charset Lookup error: unknown charset: ' + charset + '; ignored and set to default: ' + self._charset);
>                     unicode_str = self._charset
>                 except Exception, e:
>                     om.out.error( str(e) )
> # Achim }
>
> hope this helps to handle broken pages ;-)

Excellent bug report, I fixed it in 1 minute =)
There was only one small problem with your solution: "unicode_str
= self._charset" assigns the name of the default charset instead of
the decoded body, so every response with an unknown encoding ends up
being the string "UTF-8". I did it this way:

                # Now that we have the charset, we use it! (and save it)
                # The return value of the decode function is a unicode string.
                try:
                    unicode_str = body.decode(charset, 'returnEscapedChar')
                except LookupError:
                    # warn about a buggy charset
                    msg = 'Charset LookupError: unknown charset: ' + charset
                    msg += '; ignored and set to default: ' + self._charset
                    om.out.debug( msg )

                    # Use the default
                    unicode_str = body.decode(self._charset,
                                              'returnEscapedChar')
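
For anyone who wants to try the fallback idea outside of w3af, here is a
minimal standalone sketch. It is not the code from httpResponse.py: the
names decode_body and DEFAULT_CHARSET are made up for illustration,
DEFAULT_CHARSET only stands in for self._charset, and the built-in
'replace' error handler stands in for w3af's own 'returnEscapedChar'
handler. It is written for Python 3 rather than the Python 2.5 shown in
the traceback, and it checks the charset with codecs.lookup() up front
instead of wrapping the decode call itself.

```python
import codecs

# Assumption: stands in for self._charset in the snippet above.
DEFAULT_CHARSET = 'utf-8'


def decode_body(body, charset, default=DEFAULT_CHARSET):
    """Decode raw bytes, falling back to `default` for an unknown charset.

    Returns a (text, charset_used) tuple so the caller can see whether
    the fallback was taken.
    """
    try:
        # codecs.lookup() raises LookupError for names such as 'utf-81'.
        codecs.lookup(charset)
    except LookupError:
        # Buggy charset: ignore it and use the default instead.
        charset = default

    # 'replace' is a stand-in for w3af's custom error handler.
    return body.decode(charset, 'replace'), charset


if __name__ == '__main__':
    text, used = decode_body(b'caf\xc3\xa9', 'utf-81')
    print(used, repr(text))
```

The important part is the same as in the fix: on a LookupError the body is
decoded again with the default charset, rather than being replaced by the
charset name.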

Thank you very much! Please keep reporting bugs =)

> Achim
>
>
> ------------------------------------------------------------------------------
> The NEW KODAK i700 Series Scanners deliver under ANY circumstances! Your
> production scanning environment may not be a perfect world - but thanks to
> Kodak, there's a perfect scanner to get the job done! With the NEW KODAK i700
> Series Scanner you'll get full speed at 300 dpi even with all image
> processing features enabled. http://p.sf.net/sfu/kodak-com
> _______________________________________________
> W3af-develop mailing list
> W3af-develop@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/w3af-develop
>



-- 
Andrés Riancho
http://www.bonsai-sec.com/
http://w3af.sourceforge.net/
