Achim,

On Wed, May 6, 2009 at 7:08 AM, Achim Hoffmann <a...@securenet.de> wrote:
> Hi Andres,
>
> got a buggy charset in a meta tag, which forced following:
>
> -----------
> [ 05/06/09 11:09:40 - debug ] GET https://some.tld/some.do?WSDL returned HTTP code "204"
> [ 05/06/09 11:09:40 - debug ] Unhandled exception in xUrllib._send(): unknown encoding: utf-81
> [ 05/06/09 11:09:40 - debug ] Traceback (most recent call last):
> [ 05/06/09 11:09:40 - debug ]   File "D:\Programme\w3af\core\data\url\xUrllib.py", line 468, in _send
> [ 05/06/09 11:09:40 - debug ]     res = self._cacheOpener.open( req )
> [ 05/06/09 11:09:40 - debug ]   File "C:\Python25\lib\urllib2.py", line 387, in open
> [ 05/06/09 11:09:40 - debug ]     response = meth(req, response)
> [ 05/06/09 11:09:40 - debug ]   File "D:\Programme\w3af\core\data\url\handlers\logHandler.py", line 159, in http_response
> [ 05/06/09 11:09:40 - debug ]     self._log_request_response( request, response )
> [ 05/06/09 11:09:40 - debug ]   File "D:\Programme\w3af\core\data\url\handlers\logHandler.py", line 154, in _log_request_response
> [ 05/06/09 11:09:40 - debug ]     res = httpResponse.httpResponse( code, body, hdrs, url, url, msg=msg, id=id)
> [ 05/06/09 11:09:40 - debug ]   File "D:\Programme\w3af\core\data\url\httpResponse.py", line 58, in __init__
> [ 05/06/09 11:09:40 - debug ]     self.setBody(read)
> [ 05/06/09 11:09:40 - debug ]   File "D:\Programme\w3af\core\data\url\httpResponse.py", line 106, in setBody
> [ 05/06/09 11:09:40 - debug ]     self._charset_handling(body)
> [ 05/06/09 11:09:40 - debug ]   File "D:\Programme\w3af\core\data\url\httpResponse.py", line 178, in _charset_handling
> [ 05/06/09 11:09:40 - debug ]     unicode_str = body.decode(charset, 'returnEscapedChar')
> [ 05/06/09 11:09:40 - debug ] LookupError: unknown encoding: utf-81
> -----------
>
> fixed it temporarily as follows: w3af\core\data\url\httpResponse.py
>
>     # Achim {
>     # Achim # unicode_str = body.decode(charset, 'returnEscapedChar')
>     try:
>         unicode_str = body.decode(charset, 'returnEscapedChar')
>     except LookupError:
>         om.out.error('charset Lookup error: unknown charset: ' + charset + '; ignored and set to default: ' + self._charset);
>         unicode_str = self._charset
>     except Exception, e:
>         om.out.error( str(e) )
>     # Achim }
>
> hope this helps to handle broken pages ;-)

Excellent bug report, I fixed it in 1 minute =)

There was only one simple problem with your solution: the "unicode_str = self._charset" line sets the body of every response with an unknown encoding to the string "UTF-8" instead of decoding it. I did it this way:

    # Now that we have the charset, we use it! (and save it)
    # The return value of the decode function is a unicode string.
    try:
        unicode_str = body.decode(charset, 'returnEscapedChar')
    except LookupError:
        # warn about a buggy charset
        msg = 'Charset LookupError: unknown charset: ' + charset
        msg += '; ignored and set to default: ' + self._charset
        om.out.debug( msg )
        # Use the default
        unicode_str = body.decode(self._charset, 'returnEscapedChar')

Thank you very much! Please keep reporting bugs =)

> Achim
>
>
> ------------------------------------------------------------------------------
> The NEW KODAK i700 Series Scanners deliver under ANY circumstances! Your
> production scanning environment may not be a perfect world - but thanks to
> Kodak, there's a perfect scanner to get the job done! With the NEW KODAK i700
> Series Scanner you'll get full speed at 300 dpi even with all image
> processing features enabled. http://p.sf.net/sfu/kodak-com
> _______________________________________________
> W3af-develop mailing list
> W3af-develop@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/w3af-develop
>

--
Andrés Riancho
http://www.bonsai-sec.com/
http://w3af.sourceforge.net/
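
For anyone who wants to try the fallback outside of w3af, here is a minimal standalone sketch of the idea discussed in this thread. It is not the w3af code: `decode_body` and `DEFAULT_CHARSET` are hypothetical names, the default of "utf-8" is an assumption standing in for `self._charset`, the built-in 'replace' error handler stands in for w3af's custom 'returnEscapedChar' handler, and it is written for Python 3, where the body is a bytes object.

```python
import logging

log = logging.getLogger(__name__)

# Assumed default; plays the role of self._charset in the thread.
DEFAULT_CHARSET = "utf-8"


def decode_body(body, charset, default=DEFAULT_CHARSET):
    """Decode raw response bytes, falling back to `default` when the
    declared charset is unknown (e.g. "utf-81").

    Returns a (text, charset_used) tuple.
    """
    try:
        # 'replace' is a built-in error handler, used here instead of
        # w3af's custom 'returnEscapedChar' handler.
        return body.decode(charset, "replace"), charset
    except LookupError:
        # Unknown charset name: log it and decode with the default,
        # rather than assigning the charset name to the result.
        log.debug("unknown charset %r; falling back to %r", charset, default)
        return body.decode(default, "replace"), default


if __name__ == "__main__":
    text, used = decode_body(b"caf\xc3\xa9", "utf-81")
    print(used, text)
```

The important detail is that the except branch still decodes the body; only the charset name changes.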