I have Nutch crawling and Solr indexing successfully, and I have dumped the index to XML with Luke.

What I would like to do is generate one XML file per crawled URL for loading into an XML database (MarkLogic). Sure, I could write a Java or XQuery tool to convert the one big XML file that Luke dumps into individual files.
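
For what it's worth, a splitter like that can stay pretty small. Below is a rough Java sketch using StAX: it streams the big dump and copies each per-document subtree out to its own file. The element name "document", the doc-N.xml naming scheme, and the command-line arguments are all assumptions on my part, not anything Luke guarantees; adjust them to match the actual dump structure.

import java.io.File;
import java.io.FileInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;

public class LukeDumpSplitter {
    public static void main(String[] args) throws Exception {
        File dump = new File(args[0]);    // the one big XML file from Luke
        File outDir = new File(args[1]);  // target directory for the pieces
        outDir.mkdirs();

        XMLStreamReader in = XMLInputFactory.newInstance()
                .createXMLStreamReader(new FileInputStream(dump));
        // Identity transformer: copies whatever subtree the reader sits on.
        Transformer copy = TransformerFactory.newInstance().newTransformer();

        int n = 0;
        while (in.hasNext()) {
            // "document" is an assumption about the dump's per-doc element.
            if (in.next() == XMLStreamConstants.START_ELEMENT
                    && "document".equals(in.getLocalName())) {
                // Naming the files doc-N.xml here; naming them by URL would
                // mean peeking at the URL field first, which takes a little
                // buffering.
                File out = new File(outDir, "doc-" + (n++) + ".xml");
                copy.transform(new StAXSource(in), new StreamResult(out));
            }
        }
        in.close();
    }
}

An XQuery version run inside MarkLogic itself might be even shorter, since you could load the whole dump and call xdmp:document-insert once per document element.
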

Ideally, though, Nutch would output these files itself, so I wouldn't need Solr, Luke, and a tool I'd have to write in the content-processing chain. KISS, right?

Any thoughts on how to do this in the simplest way?

Thanks,

Mike
