Re: [PHP] parsing form with a website question...
On Thu, 2008-08-14 at 15:47 -0700, bruce wrote:
> Hi guys...
>
> Got a question that I figured I'd ask before I reinvent the wheel.
>
> A basic website has a form, or multiple forms. within the form, there
> might be multiple elements (lists/select statements, etc...). each item
> would have a varname, which would in turn be used as part of the form
> action, to create the entire query...
>
> sort of like:
>
> <form action=test.php?>
>   <option name=foo> foo=1 foo=2 foo=3 foo=4 </option>
>   <option name=cat> cat=1 cat=2 cat=3 </option>
> </form>
>
> so you'd get the following urls in this pseudo example:
>
> test.php?foo=1&cat=1
> test.php?foo=1&cat=2
> test.php?foo=1&cat=3
> test.php?foo=2&cat=1
> test.php?foo=2&cat=2
> test.php?foo=2&cat=3
> test.php?foo=3&cat=1
> test.php?foo=3&cat=2
> test.php?foo=3&cat=3
> test.php?foo=4&cat=1
> test.php?foo=4&cat=2
> test.php?foo=4&cat=3
>
> i'm looking for an app that has the ability to parse any given form on a
> web page, returning the complete list of possible url combinations based
> on the underlying elements that make up/define the form...
>
> anybody ever seen anything remotely close to this...???
>
> i've been researching crawlers, thinking that this kind of functionality
> would already exist, but so far, no luck!

A little algorithm analysis would show you that doing so requires storage space on an exponential scale... as such you won't find it. Also, what would you put into text/textarea fields? I've heard Google has begun experiments to index the deep web, but they just take somewhat educated guesses at filling in forms, not at expanding the exponential result set.

For a simple analysis of the problem, take 2 select fields with 2 options each... you have 4 possible outcomes (2 * 2). Now take 3 select lists with 3 items, 4 items, and 5 items. You now have 60 possible outcomes (3 * 4 * 5). From this it is easy to see the relationship is a * b * c * ... * x. So take a form with 10 select fields each with 10 items. That evaluates to 10^10 = 10,000,000,000.

In other words, with a mere 10 drop-down selects each with 10 items, the solution space consists of 10 billion combinations. Now let's say each item costs exactly 1 byte to store, so you need 10 bytes to store one particular solution. That's 100 billion bytes AKA 100 metric gigabytes... remember that was just 1 form.

Cheers,
Rob.
--
http://www.interjinn.com
Application and Templating Framework for PHP

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
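
Rob's arithmetic can be checked with a few lines of code. What follows is only a sketch in Python: the function names `combination_count` and `storage_bytes` are made up for illustration, and the 1-byte-per-item cost is the same simplifying assumption Rob makes in his message.

```python
from math import prod


def combination_count(option_counts):
    """Number of distinct submissions: the product a * b * c * ... * x."""
    return prod(option_counts)


def storage_bytes(option_counts, bytes_per_item=1):
    """Bytes needed to store every combination, one item per field.

    bytes_per_item=1 mirrors the assumption in the message above.
    """
    return combination_count(option_counts) * len(option_counts) * bytes_per_item


print(combination_count([2, 2]))      # 4
print(combination_count([3, 4, 5]))   # 60
print(combination_count([10] * 10))   # 10000000000
print(storage_bytes([10] * 10))       # 100000000000
```

The last figure is the 100 billion bytes quoted above: 10 billion combinations at 10 bytes each.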
RE: [PHP] parsing form with a website question...
rob,

i'm fully aware of the issues, and for the targeted sites that i'm focusing on, i can employ strategies to prune the tree...

but the overall issue is that i'm looking for a tool/app/process that does what i've described.

the basic logic is that the app needs to use a config file, and that the app should somehow find the requisite form using perhaps xpath, in combination with some kind of pattern recognition/regex functionality...

once the app has the form, it can then get the underlying stuff (selects/lists/items, etc.), which will form the basis for the querystrings to the form action...

ain't life grand!!

thanks...

-Original Message-
From: Robert Cummings [mailto:[EMAIL PROTECTED]]
Sent: Thursday, August 14, 2008 4:57 PM
To: bruce
Cc: php-general@lists.php.net
Subject: Re: [PHP] parsing form with a website question...

-snip-
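
The process bruce describes (locate the form, collect its selects and their options, then build query strings against the form action) might be sketched roughly as below. This is not an existing tool, only an illustration under several assumptions: it uses Python's standard `html.parser` rather than the XPath lookup bruce mentions, it expects conventional `<select>`/`<option value=...>` markup rather than the pseudo-markup in the original post, and `SelectCollector`, `form_urls`, and `SAMPLE` are hypothetical names. URLs are generated lazily, and the cap of 1000 is an arbitrary stand-in for whatever pruning strategy a real crawler would apply.

```python
from html.parser import HTMLParser
from itertools import islice, product
from urllib.parse import urlencode


class SelectCollector(HTMLParser):
    """Collects the action of the first <form> and its <select> option values."""

    def __init__(self):
        super().__init__()
        self.action = None     # action attribute of the first form seen
        self.fields = {}       # select name -> list of option values
        self._in_form = False  # True while inside the first form
        self._current = None   # name of the <select> currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and self.action is None:
            self.action = attrs.get("action") or ""
            self._in_form = True
        elif tag == "select" and self._in_form and attrs.get("name"):
            self._current = attrs["name"]
            self.fields[self._current] = []
        elif tag == "option" and self._current and attrs.get("value") is not None:
            # Options without a value attribute are skipped in this sketch.
            self.fields[self._current].append(attrs["value"])

    def handle_endtag(self, tag):
        if tag == "select":
            self._current = None
        elif tag == "form":
            self._in_form = False


def form_urls(html):
    """Lazily yield one URL per combination of select values."""
    parser = SelectCollector()
    parser.feed(html)
    if parser.action is None:  # no form on the page
        return
    names = list(parser.fields)
    base = parser.action.rstrip("?") + "?"
    for values in product(*(parser.fields[n] for n in names)):
        yield base + urlencode(list(zip(names, values)))


SAMPLE = """
<form action="test.php">
  <select name="foo">
    <option value="1">1</option><option value="2">2</option>
    <option value="3">3</option><option value="4">4</option>
  </select>
  <select name="cat">
    <option value="1">1</option><option value="2">2</option>
    <option value="3">3</option>
  </select>
</form>
"""

urls = list(islice(form_urls(SAMPLE), 1000))  # arbitrary cap as crude pruning
print(len(urls))  # 12
print(urls[0])    # test.php?foo=1&cat=1
```

Because `form_urls` is a generator, nothing forces the full product to be held in memory; the exponential cost only bites if every combination is actually requested or stored.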
Re: [PHP] parsing form with a website question...
At 7:57 PM -0400 8/14/08, Robert Cummings wrote:
> On Thu, 2008-08-14 at 15:47 -0700, bruce wrote:
> -snip-
>
> That's 100 billion bytes AKA 100 metric gigabytes... remember that was
> just 1 form.
>
> Cheers,
> Rob.

Killjoy. :-)

He could have had a lot of fun figuring that out.

tedd

--
http://sperling.com http://ancientstones.com http://earthstones.com
Re: [PHP] parsing form with a website question...
bruce wrote:
> rob,
>
> i'm fully aware of the issues, and for the targeted sites that i'm
> focusing on, i can employ strategies to prune the tree...
>
> but the overall issue is that i'm looking for a tool/app/process that
> does what i've described.
>
> the basic logic is that the app needs to use a config file, and that the
> app should somehow find the requisite form using perhaps xpath, in
> combination with some kind of pattern recognition/regex functionality...
>
> once the app has the form, it can then get the underlying stuff
> (selects/lists/items, etc.. which will form the basis for the
> querystrings to the form action...

Don't know of anything that does this offhand, but it'd be a good project for a security check app :) See what values/options the form accepts and what it fails with.

--
Postgresql php tutorials
http://www.designmagick.com/
Re: [PHP] parsing form with a website question...
On Thu, 2008-08-14 at 21:39 -0400, tedd wrote:
> At 7:57 PM -0400 8/14/08, Robert Cummings wrote:
> > On Thu, 2008-08-14 at 15:47 -0700, bruce wrote:
> > -snip-
> >
> > That's 100 billion bytes AKA 100 metric gigabytes... remember that
> > was just 1 form.
> >
> > Cheers,
> > Rob.
>
> Killjoy. :-)
>
> He could have had a lot of fun figuring that out.

He was looking for a premade solution... it didn't seem like he wanted to figure it out :)

Cheers,
Rob.