As far as I know, this feature is not implemented, but it would be possible to build.
If you are a developer, I would suggest taking a look at the Solr indexer. It can give you a general idea of how to read Nutch data and transform it into another format.

Best Regards
Alexander Aristov

On 28 February 2011 14:57, Michael Lee <[email protected]> wrote:

> Hi, I was looking at nutch as a crawler for indexing into Indri. In Indri's
> docs, it lists "warc" as a corpus class option described as "WARC (Web
> ARChive) format, such as is output by the Nutch webcrawler" -- c.f.
> http://lemur.sourceforge.net/indri/IndriIndexer.html
>
> After finishing a short crawl using nutch (v1.2), I found no way to produce
> WARC output -- neither the native data store nor any of the export/dump
> options appear to be WARC. I've inquired on Indri/Lemur forums about this,
> but I thought I'd check here also if anyone knows what the docs might be
> referring to... or how else I might proceed.
>
> Thanks!
> -Michael
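
To give a rough idea of the output side of such a converter, here is a minimal Python sketch (not Nutch code) that serializes one fetched page as a WARC/1.0 "response" record. The function name `warc_record` and its parameters are hypothetical. Reading the URL, fetch time and raw content out of the Nutch segments is not shown, and I have not tested whether Indri accepts records written this way, so treat it as an illustration of the record layout only.

```python
import uuid
from datetime import datetime, timezone


def warc_record(url: str, http_payload: bytes, fetched: datetime) -> bytes:
    """Serialize one page as a WARC/1.0 'response' record (hypothetical helper).

    Assumption: http_payload holds the HTTP response bytes (status line,
    headers, body) and fetched is a timezone-aware datetime.
    """
    headers = [
        ("WARC-Type", "response"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        # WARC-Date is written in UTC, ISO 8601 form.
        ("WARC-Date", fetched.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", url),
        ("Content-Type", "application/http; msgtype=response"),
        # Content-Length counts the bytes of the content block only.
        ("Content-Length", str(len(http_payload))),
    ]
    # Version line, named fields, then a blank line before the content block.
    head = "WARC/1.0\r\n" + "".join(f"{k}: {v}\r\n" for k, v in headers) + "\r\n"
    # Each record ends with two CRLF pairs after the content block.
    return head.encode("utf-8") + http_payload + b"\r\n\r\n"


# Example with made-up page content.
page = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>hello</html>"
record = warc_record(
    "http://example.com/",
    page,
    datetime(2011, 2, 28, 14, 57, tzinfo=timezone.utc),
)
```

A converter would call something like this once per fetched document and append the results to a single .warc file.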

