[W3af-develop] importResults.py: WebScarab import

Patrick Hof Mon, 27 Jul 2009 13:37:22 -0700

Hi list,

as promised, I've implemented an import for WebScarab conversations. I was able to
pilfer large amounts of code from sqlmap's[0] WebScarab import, so most of the
credit belongs to them. Open Source FTW :). I modified the code for w3af and
also changed the coding style so it fits better with w3af. I've added the
support in almost the same way Jon did for Burp (see his email from the day
before yesterday).


While implementing this I also found an interesting bug in w3af. This took me a
while to figure out. At first, my import didn't seem to work, although the
result from my import, the list containing fuzzableRequests, seemed to be
perfectly fine. See my pdb session for the solution:

---------------------------8<-----------------------------------------
> /home/patrick/w3af/core/controllers/w3afCore.py(724)_discoverWorker()
-> if iFr not in self._alreadyWalked and urlParser.baseUrl( iFr.getURL() ) in cf.cf.getData('baseURLs'):
(Pdb) urlParser.baseUrl( iFr.getURL() ) in cf.cf.getData('baseURLs')
False
(Pdb) iFr
<QS fuzzable request | GET | http://192.168.56.101:80/ >
(Pdb) cf.cf.getData('baseURLs')
['http://192.168.56.101/']

--------------------------->8-----------------------------------------

The problem is that RFC 2616 deems the URLs above equivalent (3.2.2: "If the port
is empty or not given, port 80 is assumed"), but w3af does not. I guess the same
holds true for https and port 443. WebScarab always seems to add the port, which
is what caused the problem. When I added ":80" to my target URL, w3af
happily imported my WebScarab conversations. This is a rather subtle bug that
people might easily trip over without ever realizing what's wrong.
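
To illustrate the idea, here is a rough sketch of how such URLs could be
normalized before they are compared. This is not w3af code:
strip_default_port() and DEFAULT_PORTS are made-up names, and the sketch
assumes plain http/https URLs without credentials or IPv6 literals.

```python
try:
    from urllib.parse import urlsplit, urlunsplit   # Python 3
except ImportError:
    from urlparse import urlsplit, urlunsplit       # Python 2

# Default ports per scheme (hypothetical helper table, not part of w3af).
DEFAULT_PORTS = {'http': 80, 'https': 443}


def strip_default_port(url):
    '''
    Return the URL without an explicit port if that port is the default
    one for the scheme, so that equivalent URLs compare equal.
    Assumption: no user:password@ part and no IPv6 literal in the URL.
    '''
    parts = urlsplit(url)
    host = parts.hostname or ''
    port = parts.port

    # Keep the port only when it differs from the scheme's default.
    if port is not None and port != DEFAULT_PORTS.get(parts.scheme):
        host = '%s:%d' % (host, port)

    # An empty path is treated as "/".
    return urlunsplit((parts.scheme, host, parts.path or '/',
                       parts.query, parts.fragment))
```

With something like this, both 'http://192.168.56.101:80/' and
'http://192.168.56.101/' end up as the same string, while a non-default port
such as ':8080' is kept.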

Have fun with the code and I hope it's usable. If there are still bugs, they
must be from the sqlmap guys ;).


Patrick


[0] http://sqlmap.sourceforge.net

-- 
The Plague: You wanted to know who I am, Zero Cool? Well, let me explain 
            the New World Order. Governments and corporations need people
            like you and me. We are Samurai... the Keyboard Cowboys... and
            all those other people who have no idea what's going on are 
            the cattle... Moooo.
(Hackers)

'''
importResults.py

Copyright 2006 Andres Riancho

This file is part of w3af, w3af.sourceforge.net .

w3af is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation version 2 of the License.

w3af is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with w3af; if not, write to the Free Software
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA

'''

import core.controllers.outputManager as om

# options
from core.data.options.option import option
from core.data.options.optionList import optionList

from core.controllers.basePlugin.baseDiscoveryPlugin import baseDiscoveryPlugin
from core.data.request.frFactory import createFuzzableRequestRaw
from core.controllers.w3afException import w3afRunOnce

import csv
import os.path
import re


class importResults(baseDiscoveryPlugin):
    '''
    Import URLs found by other tools.
    @author: Andres Riancho ( andres.rian...@gmail.com )
    '''
    def __init__(self):
        baseDiscoveryPlugin.__init__(self)

        # Internal variables
        self._exec = True
        
        # User configured parameters
        self._input_file = ''
        self._ws_conv_dir = ''

    def discover(self, fuzzableRequest ):
        '''
        Read the input file, and create the fuzzableRequests based on that information.
        
        @parameter fuzzableRequest: A fuzzableRequest instance that contains
                                    (among other things) the URL to test. It ain't used.
        '''
        if not self._exec:
            # This will remove the plugin from the discovery plugins to be run.
            raise w3afRunOnce()

        else:
            self._exec = False
            res = []

            # Load data from CSV
            if self._input_file != '':
                try:
                    file_handler = file( self._input_file )
                except Exception, e:
                    msg = 'An error was found while trying to read the input file: "'
                    msg += str(e) + '".'
                    om.out.error( msg )
                else:
                    for row in csv.reader(file_handler):
                        obj = self._obj_from_file( row )
                        if obj:
                            res.append( obj )
            # Load data from WebScarab's saved conversations
            elif self._ws_conv_dir != '':
                try:
                    files = os.listdir( self._ws_conv_dir )
                except Exception, e:
                    msg = 'An error was found while trying to read the conversations directory: "'
                    msg += str(e) + '".'
                    om.out.error( msg )
                else:
                    files.sort()
                    for req_file in files:
                        # Only read requests, not the responses.
                        if not re.search( "([\d]+)\-request", req_file ):
                            continue
                        objs = self._objs_from_ws( os.path.join( self._ws_conv_dir, req_file ) )
                        res += objs
        return res

    def _obj_from_file( self, csv_row ):
        '''
        @return: A fuzzableRequest based on the csv_row, or None if the row is malformed.
        '''
        try:
            (method, uri, postdata) = csv_row
        except ValueError, value_error:
            msg = 'The file format is incorrect, an error was found while parsing: "'
            msg += str(csv_row) + '". Exception: "' + str(value_error) + '".'
            om.out.error( msg )
        else:
            # Create the obj based on the information
            return createFuzzableRequestRaw( method, uri, postdata, {} )
    
    def _objs_from_ws( self, req_file ):
        '''
        This code was largely copied from Bernardo Damele's sqlmap[0] . See
        __feedTargetsDict() in lib/core/options.py. So credits belong to the
        sqlmap project.

        [0] http://sqlmap.sourceforge.net/

        @author Patrick Hof
        '''
        res = []
        fp = open( req_file, "r" )
        try:
            fread = fp.read()
        finally:
            # Always release the file handle, even if reading fails.
            fp.close()
        fread = fread.replace( "\r", "" )
        req_res_list = fread.split( "======================================================" )
        
        port   = None
        scheme = None

        for request in req_res_list:
            if scheme is None:
                scheme_port = re.search(
                        "\d\d[\:|\.]\d\d[\:|\.]\d\d\s+(http[\w]*)\:\/\/.*?\:([\d]+)",
                        request,
                        re.I
                )

                if scheme_port:
                    scheme = scheme_port.group( 1 )
                    port   = scheme_port.group( 2 )

            if not re.search ( "^[\n]*(GET|POST).*?\sHTTP\/", request, re.I ):
                continue

            if re.search( "^[\n]*(GET|POST).*?\.(gif|jpg|png)\sHTTP\/", request, re.I ):
                continue

            method       = None
            url          = None
            postdata     = None
            headers      = {}
            get_post_req = False
            params       = False
            lines        = request.split( "\n" )

            for line in lines:
                if len( line ) == 0 or line == "\n":
                    continue

                if line.startswith( "GET " ) or line.startswith( "POST " ):
                    if line.startswith( "GET " ):
                        index = 4
                    else:
                        index = 5

                    url    = line[index:line.index(" HTTP/")]
                    method = line[:index-1]

                    if "?" in line and "=" in line:
                        params = True

                    get_post_req = True

                # XXX do we really need this? This is from the sqlmap code.
                # 'data' would be 'postdata' here. I can't figure out why this
                # is needed. Does WebScarab occasionally split requests to a new
                # line if they are overly long, so that we need to search for
                # GET parameters even after the URL was parsed? But that
                # wouldn't make sense with the way 'url' is set in line 168.
                # 
                # GET parameters 
                # elif "?" in line and "=" in line and ": " not in line:
                #     data    = line
                #     params  = True

                # Parse headers
                elif ": " in line:
                    key, value = line.split(": ", 1)
                    headers[key] = value

                # POST parameters
                elif method is not None and method == "POST" and "=" in line:
                    postdata = line
                    params   = True

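            if get_post_req:
                if not url.startswith( "http" ):
                    # Relative request line: take the host from the Host
                    # header, dropping any port it carries since the port is
                    # appended separately below.
                    host = headers.get( "Host", "" ).split( ":" )[0]
                    if not host:
                        continue
                    url    = "%s://%s:%s%s" % ( scheme or "http", host, port or "80", url )
                    scheme = None
                    port   = None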

                if url not in res:
                    res.append( createFuzzableRequestRaw( method, url, postdata, headers ) )
        return res
    
    def getOptions( self ):
        '''
        @return: A list of option objects for this plugin.
        '''
        d1 = 'Define the input file from which to create the fuzzable requests'
        h1 = 'The input file is comma separated and holds the following data:'
        h1 += ' HTTP-METHOD,URI,POSTDATA'
        o1 = option('input_file', self._input_file, d1, 'string', help=h1)

        d2 = 'Define the WebScarab conversations directory from which to create the fuzzable requests'
        h2 = 'Standard WebScarab conversations directory'
        o2 = option('ws_conv_dir', self._ws_conv_dir, d2, 'string', help=h2)
        
        ol = optionList()
        ol.add(o1)
        ol.add(o2)
        return ol
        
    def setOptions( self, optionsMap ):
        '''
        This method sets all the options that are configured using the user interface 
        generated by the framework using the result of getOptions().
        
        @parameter optionsMap: A dictionary with the options for the plugin.
        @return: No value is returned.
        ''' 
        self._input_file = optionsMap['input_file'].getValue()
        self._ws_conv_dir = optionsMap['ws_conv_dir'].getValue()
        
    def getPluginDeps( self ):
        '''
        @return: A list with the names of the plugins that should be run before the
        current one.
        '''
        return []
    
    def getLongDesc( self ):
        '''
        @return: A DETAILED description of the plugin functions and features.
        '''
        return '''
        This plugin serves as an entry point for the results of other tools that
        search for URLs.  The plugin reads an input file that is comma separated
        and holds the following data:
        HTTP-METHOD,URI,POSTDATA.
        It also reads URLs from WebScarab saved sessions. WebScarab saves all
        requests sent in a directory named "conversations" by default. If this
        directory is set as ws_conv_dir, this plugin automatically reads all
        available request URLs found there.
        
        Two configurable parameters exist:
            - input_file
            - ws_conv_dir
        '''

------------------------------------------------------------------------------

_______________________________________________
W3af-develop mailing list
W3af-develop@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/w3af-develop
