Hi... I'm playing around with an app that parses websites, extracts information, and returns certain pieces of it to my system.
My primary issue is how to architect the piece that puts the extracted information into my database (I'm using/testing with MySQL), and in particular how to scale that kind of system. If I have a server spawning hundreds of apps, each firing off a page fetch to a web server, I'm going to have more than enough results coming back to swamp the writes to a single MySQL server. So how do other apps/crawlers handle this kind of situation? Basically, I'm trying to figure out how to implement some kind of scaling/funneling mechanism that would let me run 10-20 servers crawling specific sites and returning the information to one database. Any thoughts/comments/pointers on how to deal with this would be helpful!!
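To make the question concrete, here is roughly the kind of funnel I'm imagining, just a sketch under some assumptions (Python, a hypothetical pages(url, title) table, and the mysql-connector-python driver): crawler workers drop parsed results onto a bounded queue, and a single writer thread drains the queue and batch-inserts, so MySQL only ever sees one connection instead of hundreds.

    import queue
    import threading

    import mysql.connector  # pip install mysql-connector-python

    # Bounded queue between the many crawler workers and the single DB writer.
    # If the writer falls behind, put() blocks the crawlers instead of letting
    # hundreds of connections pile up in front of MySQL.
    results = queue.Queue(maxsize=10000)
    STOP = object()  # sentinel telling the writer to flush and exit


    def fetch_and_parse(url):
        # Stand-in for the existing fetch/extract code.
        return "title of " + url


    def crawler_worker(urls):
        for url in urls:
            results.put((url, fetch_and_parse(url)))  # blocks while queue is full


    def db_writer(batch_size=500):
        # One connection and batched INSERTs -- the "funnel" in front of MySQL.
        conn = mysql.connector.connect(
            host="db-host", user="crawler", password="secret", database="crawl")
        cur = conn.cursor()
        batch = []
        while True:
            item = results.get()
            if item is not STOP:
                batch.append(item)
            if batch and (item is STOP or len(batch) >= batch_size):
                cur.executemany(
                    "INSERT INTO pages (url, title) VALUES (%s, %s)", batch)
                conn.commit()
                batch.clear()
            if item is STOP:
                break
        conn.close()


    if __name__ == "__main__":
        writer = threading.Thread(target=db_writer)
        writer.start()
        crawlers = [
            threading.Thread(target=crawler_worker,
                             args=(["http://example.com/%d" % i],))
            for i in range(100)
        ]
        for t in crawlers:
            t.start()
        for t in crawlers:
            t.join()
        results.put(STOP)
        writer.join()

I'm guessing that across 10-20 crawl servers the same idea applies, with the in-process queue replaced by something network-reachable (a message broker, a staging table, or flat files bulk-loaded with LOAD DATA INFILE), so only a small writer tier ever opens connections to MySQL -- but I don't know if that's how real crawlers actually do it, hence the question.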
Thanks, -bruce