Nevertheless, I just wanted to point out that not every library seems to
validate/sanitize all of its input properly:
(core/data/url/handlers/redirect.py)

        # fix a possible malformed URL
        urlparts = urlparse.urlparse(newurl)
        if not urlparts.path:
            urlparts = list(urlparts)
            urlparts[2] = "/"
        newurl = urlparse.urlunparse(urlparts)

urlparse, for example, won't complain about a URL like
"http://foobar.com:some_non_integer_input/foo...".

>>> urlparse.urlparse("http://w3af.org:fooo:myhost.com/foo?bar=bla").netloc
'w3af.org:fooo:myhost.com'
>>> urlparse.urlparse("http://w3af.org:fooo:myhost.com/foo?bar=bla").hostname
'w3af.org'

Accessing the "port" attribute raises an exception for such a URL, but no
type casting or validation is performed for the "netloc" attribute.
Worse: no validation is performed when you unparse the urlparse object.

So in the end, newurl could contain arbitrary, unvalidated input.
I'm kind of afraid of bugs like that, but this topic isn't related to the UTF-8
stuff anymore...
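
For what it's worth, here is a minimal sketch of the kind of check I have in
mind. It assumes Python 3's urllib.parse (the successor of the urlparse
module), and normalize_redirect_url is a hypothetical helper name, not
something that exists in w3af. Restricting the scheme to http/https is also
my assumption:

```python
from urllib.parse import urlparse, urlunparse


def normalize_redirect_url(newurl):
    """Hypothetical helper (not part of w3af): validate, then fix the path."""
    parts = urlparse(newurl)

    # Assumption: only http(s) redirect targets are acceptable here.
    if parts.scheme not in ("http", "https"):
        raise ValueError("unsupported scheme in %r" % (newurl,))
    if not parts.hostname:
        raise ValueError("missing host in %r" % (newurl,))

    # Reading .port is what triggers the integer conversion; it raises
    # ValueError for a netloc such as "w3af.org:fooo:myhost.com".
    parts.port

    # Same fix as in redirect.py: an empty path becomes "/".
    if not parts.path:
        parts = parts._replace(path="/")
    return urlunparse(parts)
```

With a check like this, the malformed example above is rejected with a
ValueError instead of being passed through urlunparse unchanged.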

Regards,
Daniel

On 16.02.2012 at 15:35, Andres Riancho wrote:

> Daniel,
> 
> On Thu, Feb 16, 2012 at 10:38 AM, Daniel Zulla
> <daniel.zu...@googlemail.com> wrote:
>>    All software has vulnerabilities, it's in their nature :)
>> 
>> 
>> Right.
>> 
>>    Not really. As soon as the byte string enters w3af, the best
>> thing to do is to decode it using the best encoding available (the
>> charset from the Content-Type header, or some other hint we might have
>> in the HTTP response); after that, all the rest of w3af's code simply
>> forgets about encodings and uses the unicode string.
>> 
>> 
>> Cool.
>> 
>>    Vulnerable to what?
>> 
>> 
>> A forced crash. I can't see any validation of the incoming data. E.g.:
>> is resp.code really an integer > 100 and < 900?
> 
> That's because the validation is done in httplib, please see "    def
> _read_status(self):" in httplib.py. We use urllib2, which uses
> httplib, so we don't have to worry about that. The worst thing that
> can happen is that we get a BadStatusLine exception and we're handling
> those in our code in order to avoid crashes.
> 
>>    We're not assuming that. If the response is not HTTP, then httplib,
>> or urllib, or urllib2 (I don't really know which one) will fail and
>> raise an exception.
>> 
>> 
>> That's my point. I would like to be sure about that. Because, for example,
>> if there is additional C++-based code in w3af one day, and there are
>> chances to bypass filters or to cause exceptions, a Python exception could
>> turn into a really dangerous, exploitable flaw in PyQt4 or Cython referenced
>> code really quickly.
> 
>    Could be, but we ARE doing proper error handling in xUrllib and httplib.py
> 
>>    Could you explain this to me a little bit more? I tried to
>> google for ChunkOfUnidentified or ChunkOfUnidentifiedData and found
>> nothing.
>> 
>> 
>> http://docs.python.org/release/3.0.1/whatsnew/3.0.html#text-vs-data-instead-of-unicode-vs-8-bit
> 
> Quoting you: "Everything is a ChunkOfUnidentified data until it gets
> converted to a string. If it's a string, it's Unicode and everything is
> fine. If not, everything breaks immediately."
> 
> "Everything is a ChunkOfUnidentified data until it gets converted to a
> string. If it's a string, it's Unicode and everything is fine." That's
> what we're doing now at w3af. We receive a string of bytes and convert
> it to a unicode string based on the encoding that was indicated by the
> HTTP response. In some cases we're having errors in the conversion
> (because of various reasons that would also happen in py3k), that's
> why we have those bugs.
> 
> "If not, everything breaks immediately." We're trying to avoid that :)
> The problem is that if we use errors=ignore/replace we end up in a
> situation where we don't know about the errors and can't fix them.
> 
> PS: Please check how to properly answer emails inline so that it is
> then easier to answer back :)
> 
>> Regards,
>> Daniel
>> On 16.02.2012 at 14:26, Andres Riancho wrote:
>> 
>> Daniel,
>> 
>> On Thu, Feb 16, 2012 at 10:07 AM, Daniel Zulla
>> <daniel.zu...@googlemail.com> wrote:
>> 
>> I have analyzed some closed source vulnerability scanners, and audited open
>> source scanners like skipfish.
>> 
>> Some of them are ironically vulnerable. Somebody may create an apache2
>> module that recognizes attacks in order to force penetration testers'
>> software to crash (or worse, e.g. to execute arbitrary code).
>> 
>> 
>>    All software has vulnerabilities, it's in their nature :)
>> 
>> errors=ignore or errors=replace may be a nice way to go, but - here are my
>> two cents:
>> 
>> Treating HTTP responses as an "UnidentifiedChunkOfPossiblyMaliciousData" for
>> as long as possible is definitely the right way to go.
>> 
>> 
>>    Not really. As soon as the byte string enters w3af, the best
>> thing to do is to decode it using the best encoding available (the
>> charset from the Content-Type header, or some other hint we might have
>> in the HTTP response); after that, all the rest of w3af's code simply
>> forgets about encodings and uses the unicode string.
>> 
>> I haven't audited or reviewed httplib, but the "from_httplib_resp"
>> method looks extremely vulnerable:
>> 
>> 
>>    Vulnerable to what?
>> 
>>    resp = httplibresp
>>    code, msg, hdrs, body = (resp.code, resp.msg, resp.info(), resp.read())
>> 
>>    if original_url:
>>        url_inst = url_object(resp.geturl(), original_url.encoding)
>>    else:
>>        url_inst = original_url = url_object(resp.geturl())
>> 
>>    charset = getattr(httplibresp, 'encoding', None)
>>    return httpResponse(code, body, hdrs, url_inst,
>>                        original_url, msg, charset=charset)
>> 
>> 
>> I am just skeptical about assuming that the response of a webserver is valid
>> HTTP.
>> 
>> 
>>    We're not assuming that. If the response is not HTTP, then httplib,
>> or urllib, or urllib2 (I don't really know which one) will fail and
>> raise an exception.
>> 
>> That's why I mentioned py3k - it's exactly how Python3 handles external
>> data:
>> 
>> Everything is a ChunkOfUnidentified data until it gets converted to a
>> string. If it's a string, it's Unicode and everything is fine. If not,
>> everything breaks immediately.
>> 
>> 
>>    Could you explain this to me a little bit more? I tried to
>> google for ChunkOfUnidentified or ChunkOfUnidentifiedData and found
>> nothing.
>> 
>> 
>> Regards,
>> 
>> Daniel
>> 
>> 
>> On 16.02.2012 at 13:33, Andres Riancho wrote:
>> 
>> 
>> sends a string of bytes back to you in the HTTP response.
>> 
>> 
>> Do you have some code / an example where those exceptions usually appear in
>> the current w3af code?
>> 
>> 
>> Regards,
>> 
>> Daniel
>> 
>> 
>> On 15.02.2012 at 22:06, Javier Andalia wrote:
>> 
>> 
>> Hello Daniel,
>> 
>> 
>> On Wed, Feb 15, 2012 at 5:11 PM, Daniel Zulla
>> <daniel.zu...@googlemail.com> wrote:
>> 
>> What about switching over to Python3?
>> 
>> It solves the UnicodeDecodeException madness.
>> 
>> 
>> Can you please be more specific? What exactly do you have in mind?
>> 
>> 
>> Maybe I'm wrong, but the way I see it w3af would still
>> receive/transmit encoded bytes so there's no way to skip the
>> bytestring_to_unicode and unicode_to_bytestring conversions. Not even
>> in py3k.
>> 
>> 
>> Regards,
>> 
>> 
>> Javier
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> --
>> Andrés Riancho
>> Director of Web Security at Rapid7 LLC
>> Founder at Bonsai Information Security
>> Project Leader at w3af
>> 
>> 
> 


_______________________________________________
W3af-develop mailing list
W3af-develop@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/w3af-develop
