Additionally, how hard would it be to add Crawlers for things like:

1. IMAP and other mail stores (even things like PST files, etc.)
2. Somewhat strange: databases. Just point it at a DB and have it suck in the tables, rows, and columns
3. Web APIs (Flickr, del.icio.us, etc.)

Any comments on fault tolerance and incremental crawling would also be appreciated. Is there anything in the current design that you think prevents these things?

Thanks,
Grant

On Aug 27, 2008, at 5:26 PM, Grant Ingersoll wrote:

Is there a feature list for Droids anywhere?

Or, can it do:

1. Honor robots.txt
2. Crawl throttling
3. Distributed crawling (i.e., give it a bunch of links and some distributed compute resources and have it go to town)

Thanks,
Grant

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




