I know it's only tangentially related to Nutch, but is no one
else interested in this? I've read the APIs and read a couple
of news stories about it, and it looks like you can download
the crawled data (for a relatively small fee: $1/GB).

This could be the thing that changes everything. The barrier
to entry in this field was already fairly low with Nutch, but building
up a decent-sized index takes time and a fair number of
machines. Now you can buy the crawled data and literally
get a custom search engine running overnight.

I'm guessing that many would choose not to host their
front-end search on Alexa. In that case, Nutch/Lucene would
come in very handy. Just cram the Alexa data into a Lucene
index, and use Nutch as the front-end. Instant search engine...

Howie

It doesn't sound like they are offering the data itself, only access to
it, CPU cycles used for accessing it, upload of your own data, and
such.

In other words, it doesn't sound like you can just download a chunk of
data and do your own processing with it.  That would be one mighty
chunk! :)

Otis




_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
