Paul Smith wrote:
I know there's a MapReduce branch in the Nutch project, but is there any plan/talk of perhaps integrating something like that directly into the Lucene API? For projects that need a lower-level API like Lucene, rather than the crawl-like nature of Nutch, the potential to index lots of information in an efficient manner is very appealing indeed.
You can easily use NDFS and MapReduce from Nutch without using Nutch's crawler.
Perhaps we need to factor Nutch into two projects, one with NDFS and MapReduce and the other with the search-specific code. This falls almost exactly on package lines. The packages org.apache.nutch.{io,ipc,fs,ndfs,mapred} are not dependent on the rest of Nutch.
But you don't need to wait for such a split in order to use NDFS and MapReduce. Just check out the mapred branch from SVN and don't use the parts you don't need. If you find it useful, then argue for the creation of a new project.
Doug