On Wed, Apr 11, 2012 at 12:50 PM, Andres Riancho
<andres.rian...@gmail.com> wrote:
> Taras,
>
> On Wed, Apr 11, 2012 at 12:11 PM, Andres Riancho
> <andres.rian...@gmail.com> wrote:
>> On Wed, Apr 11, 2012 at 4:56 AM, Taras <ox...@oxdef.info> wrote:
>>> Andres,
>>>
>>>
>>>>>>     If the framework IS working like this, I think that the shared
>>>>>> fuzzable request list wouldn't do much good. If it is not working like
>>>>>> this (and I would love to get an output log to show it), it seems that
>>>>>> we have a lot of work ahead of us.
>>>>>
>>>>>
>>>>> And w3afCore need to filter requests from discovery plugins on every loop
>>>>> in
>>>>> _discover_and_bruteforce(), am I right?
>>>>
>>>>
>>>> It should filter things as they come out of the plugin and before
>>>> adding them to the fuzzable request list,
>>>
>>> Agree, but as I see in w3afCore.py there is no filtering in it.
>>> I just have added it [0]. It shows good results on the test suite (see
>>> attachment).
>>>
>>> Without filtering:
>>>  Found 2 URLs and 87 different points of injection.
>>>  ...
>>>  Scan finished in 3 minutes 30 seconds.
>>>
>>> With filtering:
>>>  Found 2 URLs and 3 different points of injection.
>>>  ...
>>>  Scan finished in 11 seconds.
>>
>> Reviewing this and reproducing in my environment. Will have some opinions in 
>> ~1h
>
> All right... now I see your concern and understand it. I ran the scan
> you proposed and was able to reproduce the issue, which is actually
> caused by a single constant:
>
>    webSpider.py:
>    MAX_VARIANTS = 40
>
> Let me explain what is going on here and what your patch is doing:
>    #1 In the current trunk version, w3af's webSpider parses the
> index.php file you sent and identifies many links, most of them
> variants of each other. Before returning them to the w3afCore, the
> webSpider uses the variant_db class and MAX_VARIANTS to decide whether
> enough variants of that link have already been analyzed. If not, the
> variant still needs to be analyzed, so it is returned to the
> core. Given that MAX_VARIANTS is 40 [Note: I changed this to 5 in the
> latest commit.], the webSpider returns all/most of the links in your
> index.php to the core.
>
>    a) This makes sense: a link to a previously unknown section
> might be present in "article.php?id=25" and NOT present in
> "article.php?id=35", so w3af has to choose how many of those
> variants to analyze and how many to leave out.
>
>    b) The same happens with vulnerabilities: there might be a
> vulnerability in the foo parameter of "article.php?id=28&foo=bar" when
> id=28 that is NOT present when id=32.
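As a rough illustration of the MAX_VARIANTS idea described above, something along these lines happens (a sketch with made-up names; the real variant_db code in webSpider.py differs):

```python
from collections import Counter
from urllib.parse import urlparse, parse_qsl

MAX_VARIANTS = 5  # the value after the commit mentioned above

def variant_key(url):
    """Collapse a URL to (path, sorted parameter names), ignoring values."""
    parsed = urlparse(url)
    params = tuple(sorted(name for name, _ in parse_qsl(parsed.query)))
    return (parsed.path, params)

class VariantDB:
    """Counts how many value-variants of each URL 'shape' have been seen."""

    def __init__(self):
        self._seen = Counter()

    def need_more_variants(self, url):
        """True while fewer than MAX_VARIANTS variants were analyzed."""
        key = variant_key(url)
        if self._seen[key] >= MAX_VARIANTS:
            return False  # enough variants already analyzed; drop this one
        self._seen[key] += 1
        return True
```

With MAX_VARIANTS = 5, the first five "article.php?id=N" links are returned to the core and the rest are dropped, while "article.php?id=N&foo=X" is a different shape and gets its own counter.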
>
>    #2 With your patch, which filters all variants and "flattens" the
> previously found ones, w3afCore only ends up with
> "article.php?id=number" and "article.php?id=number&foo=string", which
> won't allow other discovery plugins to analyze the variants
> (#1-a) or audit plugins to identify the more complex
> vulnerabilities (#1-b). What will happen (of course) is that the
> scanner will be VERY fast.
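Conceptually, the flattening filter keeps only one representative per "flattened" request shape, something like this (illustrative only, not the actual changeset code):

```python
from urllib.parse import urlparse, parse_qsl

def flatten(url):
    """Replace each parameter value with a type token, e.g. id=25 -> id=number."""
    parsed = urlparse(url)
    parts = []
    for name, value in parse_qsl(parsed.query):
        token = 'number' if value.isdigit() else 'string'
        parts.append(f'{name}={token}')
    return parsed.path + '?' + '&'.join(parts)

def filter_variants(urls):
    """Keep only the first URL seen for each flattened shape."""
    seen, kept = set(), []
    for url in urls:
        shape = flatten(url)
        if shape not in seen:
            seen.add(shape)
            kept.append(url)
    return kept
```

Every "article.php?id=N" collapses to "/article.php?id=number", so only the first one survives; that is why the point-of-injection count drops from 87 to 3.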
>
> But let's try to understand what happens with the audit plugins when
> they are presented with multiple variants. According to #1-b they
> should send multiple requests, which generate a lot of network
> traffic and slow the scan down. Here is a grep of a scan with
> the audit.sqli plugin enabled:
>
> dz0@dz0-laptop:~/workspace/w3af$ grep "d'z\"0" output-w3af.txt
> GET 
> http://moth/w3af/discovery/web_spider/variants/article.php?id=145&foo=d'z"0
> returned HTTP code "200" - id: 93
> GET 
> http://moth/w3af/discovery/web_spider/variants/article.php?id=d'z"0&foo=bar
> returned HTTP code "200" - id: 94
> GET http://moth/w3af/discovery/web_spider/variants/article.php?id=d'z"0
> returned HTTP code "200" - id: 96
> GET http://moth/w3af/discovery/web_spider/variants/article.php?id=d'z"0
> returned HTTP code "200" - id: 98 - from cache.
> GET http://moth/w3af/discovery/web_spider/variants/article.php?id=d'z"0
> returned HTTP code "200" - id: 100 - from cache.
> GET http://moth/w3af/discovery/web_spider/variants/article.php?id=d'z"0
> returned HTTP code "200" - id: 102 - from cache.
> GET http://moth/w3af/discovery/web_spider/variants/article.php?id=d'z"0
> returned HTTP code "200" - id: 104 - from cache.
> GET 
> http://moth/w3af/discovery/web_spider/variants/article.php?id=d'z"0&foo=bar
> returned HTTP code "200" - id: 106 - from cache.
> GET 
> http://moth/w3af/discovery/web_spider/variants/article.php?id=122&foo=d'z"0
> returned HTTP code "200" - id: 107
> GET 
> http://moth/w3af/discovery/web_spider/variants/article.php?id=d'z"0&foo=bar
> returned HTTP code "200" - id: 109 - from cache.
> GET 
> http://moth/w3af/discovery/web_spider/variants/article.php?id=119&foo=d'z"0
> returned HTTP code "200" - id: 110
> GET 
> http://moth/w3af/discovery/web_spider/variants/article.php?id=d'z"0&foo=bar
> returned HTTP code "200" - id: 112 - from cache.
> GET http://moth/w3af/discovery/web_spider/variants/article.php?id=82&foo=d'z"0
> returned HTTP code "200" - id: 113
> GET 
> http://moth/w3af/discovery/web_spider/variants/article.php?id=d'z"0&foo=bar
> returned HTTP code "200" - id: 115 - from cache.
> GET http://moth/w3af/discovery/web_spider/variants/article.php?id=75&foo=d'z"0
> returned HTTP code "200" - id: 116
>
> The most important thing to notice here is the repeated HTTP requests
> to the variants and the "from cache" strings at the end of the
> repeated requests. For example:
>
> GET 
> http://moth/w3af/discovery/web_spider/variants/article.php?id=d'z"0&foo=bar
> returned HTTP code "200" - id: 93
> GET 
> http://moth/w3af/discovery/web_spider/variants/article.php?id=d'z"0&foo=bar
> returned HTTP code "200" - id: 95 - from cache.
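Those "from cache" lines mean the repeated request was answered from w3af's local HTTP cache instead of going out over the network again. A minimal sketch of that behavior (hypothetical code; the real cache is far more involved):

```python
class CachedHTTPClient:
    """Answers repeated identical GETs from a local cache, not the network."""

    def __init__(self, fetch):
        self._fetch = fetch   # the real network call, e.g. urllib-based
        self._cache = {}      # url -> response body

    def get(self, url):
        """Return (response, from_cache); only the first GET hits the network."""
        if url in self._cache:
            return self._cache[url], True
        response = self._fetch(url)
        self._cache[url] = response
        return response, False
```

So the duplicate requests in the grep output cost pattern matching time but (mostly) no extra network traffic.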
>
> And then, following the logic from #1-b, we actually send these two
> requests to the remote web application:
>
> GET 
> http://moth/w3af/discovery/web_spider/variants/article.php?id=215&foo=d'z"0
> returned HTTP code "200" - id: 96
> GET http://moth/w3af/discovery/web_spider/variants/article.php?id=29&foo=d'z"0
> returned HTTP code "200" - id: 105
>
> I'm not saying that this is all perfect. The downsides of this scan
> strategy are:
>    * Slow
>        - Because more HTTP requests are sent
>        - Because more pattern matching is applied to more HTTP responses
>        - Because (maybe) even responses retrieved from the
> cache are slow to get
>        - Because "MAX_VARIANTS = 40" was too high
>
> But of course this also has good things like #1-a and #1-b, which
> provide the scanner with better code coverage in the end.
>
> Maybe we could have different scan strategies, or change MAX_VARIANTS
> to be a user-defined parameter, or... (please send your ideas).-

Forgot to mention that this can be reproduced with an updated trunk
and moth. I committed all the test scripts, so you guys can run:

./w3af_console -s scripts/script-web_spider-variants.w3af

and get the same results.

> Regards,
>>>
>>>
>>>> Please let me know if the discovery process is NOT working as we
>>>> expect and if we have to filter stuff somewhere
>>>
>>> See above.
>>>
>>> [0] https://sourceforge.net/apps/trac/w3af/changeset/4861
>>> --
>>> Taras
>>> http://oxdef.info



-- 
Andrés Riancho
Project Leader at w3af - http://w3af.org/
Web Application Attack and Audit Framework

_______________________________________________
W3af-develop mailing list
W3af-develop@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/w3af-develop