Hello everyone,

I have Nutch installed and running just fine. Nutch submits the crawl
results to Solr for indexing. I need a separate field in each Solr
document that holds the raw HTML of the page. At the moment, the "content"
field holds only the parsed text from the page.

From what I read, it's impossible to do what I need without writing your own
plugin. I don't know Java that well. What would be the easiest way to
approach this task?

Thank you in advance,

Max
