You can try declaring the content field with the string type in your Solr schema.xml, as below:

<field name="content" type="string" stored="true" indexed="true"/>

On Wed, Aug 10, 2011 at 6:20 AM, Markus Jelsma <[email protected]> wrote:

> I'm not sure how to do this but i think creating an parse and indexing
> filter will do the trick. First you make the parse filter that reads the
> byte[] content from the Content object that is available in the parse
> filter. You then add the raw data in that parse filter to the parse data.
>
> In your indexing filter you simply read that field and add it to the
> document.
> See writing plugin example on the wiki for basic introduction to writing
> plugins.
>
> On Wednesday 10 August 2011 14:12:13 Christopher Gross wrote:
> > I have Nutch 1.3 running, and have it connected to a Solr 3.3
> > instance. Right now the data comes over from Nutch to Solr just fine,
> > but I'd like it to send the "content" field to Solr as the raw HTML,
> > so that I can have all the original markup to work with later.
> >
> > I've tried digging around on Google and I can't seem to find anything.
> > Can someone please push me in the right direction?
> >
> > Thanks!
> >
> > -- Christopher Gross
>
> --
> Markus Jelsma - CTO - Openindex
> http://www.linkedin.com/in/markus17
> 050-8536620 / 06-50258350
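
For what it's worth, the two-step flow Markus describes in the quoted message can be sketched in plain Java. This is only an illustration under assumptions, not a working Nutch plugin: the maps stand in for Nutch's parse metadata and index document, the "rawcontent" key and all class and method names are made up for the example, and UTF-8 is assumed where a real filter would use the page's detected encoding. A real plugin would implement Nutch's parse filter and indexing filter extension points instead.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the parse filter + indexing filter pair.
class RawContentSketch {

    // Made-up metadata key used to pass the raw HTML between the two steps.
    static final String RAW_KEY = "rawcontent";

    // Parse step: decode the fetched bytes and stash them in the parse metadata.
    // Assumes UTF-8; a real filter should use the detected page encoding.
    static void parseStep(byte[] fetchedBytes, Map<String, String> parseMeta) {
        parseMeta.put(RAW_KEY, new String(fetchedBytes, StandardCharsets.UTF_8));
    }

    // Index step: copy the stashed raw HTML into the document's "content" field.
    static void indexStep(Map<String, String> parseMeta, Map<String, String> doc) {
        String raw = parseMeta.get(RAW_KEY);
        if (raw != null) {
            doc.put("content", raw);
        }
    }

    // Runs both steps in order and returns the resulting document fields.
    static Map<String, String> run(byte[] fetchedBytes) {
        Map<String, String> parseMeta = new HashMap<>();
        Map<String, String> doc = new HashMap<>();
        parseStep(fetchedBytes, parseMeta);
        indexStep(parseMeta, doc);
        return doc;
    }

    public static void main(String[] args) {
        byte[] page = "<html><body><p>hi</p></body></html>".getBytes(StandardCharsets.UTF_8);
        System.out.println(run(page).get("content"));
    }
}
```

The point of the split is that the raw bytes are only available at parse time, so they have to be carried along in the parse data until the indexing step can put them into the document sent to Solr.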

