Hello,

I've only taken an introductory look at Droids and run through the samples, but I think I'll be using it for an upcoming project. I have a few questions first:
I ran both the SimpleRuntime example and the Cli example against a site I wish to parse. Droids seems to keep an index of the links still to be parsed and of those parsed already. Where is that list kept? In memory? Is it the queue? How big can that queue grow?

The site I will be crawling has around 500,000 pages. Is that a number that can be supported? Can the index be persisted in a database instead of being held in memory?

Some of the links to content I wish to crawl/parse/index are JavaScript pop-ups, so I would like to alter the URL that the crawler uses. That should be no problem, right?

Regards,
Robin
