Hi C.B.,

Quite a few things here.
On Fri, Jul 15, 2011 at 5:19 PM, Cam Bazz <[email protected]> wrote:

> Hello,
>
> Finally I got a working build environment, and I am doing some
> modifications and playing around.
>

Good to hear. Although it is off topic, can you share with us any hurdles you overcame? It would be good to hear how you solved your configuration problems.

> I also got my first plugin to build, and almost done with my custom parser.
>

Excellent. I will proceed with adding your comment to a page in plugin central on the wiki. In the meantime, it would be good to hear more about your plugin and what functionality it encapsulates! Would it be possible to get a wiki entry? We are a bit short of custom plugin tutorials for Nutch 1.3.

>
> I have my custom plugin and the method
>
> public ParseResult filter(Content content, ParseResult parseResult,
> HTMLMetaTags metaTags, DocumentFragment doc) { ...
>
> does indeed have all the information that I need to do my custom parsing.
>
> Now this is what I dont understand: there is a content field in solr.
> I have read the solrindexer code, and figured out that pretty much any
> field in the doc is indexed to solr.
>

If you have a look at both your schema and solr-mapping documents, you will see how fields are generated and passed to Solr for indexing.

>
> What must I do, so I can open another content like field such as
> "Content2" and put my custom extracted data, so solr indexes it? I
> think this does not have to do with solr, but the fields in the
> document.
>

My suggestion would be to specify extraction of the field within the plugin code, then add the various configuration parameters to both of the aforementioned config documents.

>
> In the recommended example, the found result is only added to
> contentMeta - and this one is not indexed by solr.
>

What recommended example? I am not following you here.

>
> Best Regards,
> -C.B.
>

--
*Lewis*
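
P.S. To make the suggestion about the two config documents a bit more concrete, here is a rough sketch of the kind of entries I have in mind. It assumes the new field is called content2 and that your plugin already puts a value under that name on the document being indexed (for example through an indexing filter). The field name, the "text" type and the stored/indexed settings are only assumptions for illustration, so please check them against your own schema and mapping files rather than copying them as they are.

```xml
<!-- Solr schema: declare the new field alongside the existing content field.
     "content2" is a hypothetical name; type and attributes are assumptions. -->
<field name="content2" type="text" stored="true" indexed="true"/>

<!-- solr-mapping document: map the document field onto the Solr field.
     source = field name on the Nutch side, dest = field name in Solr. -->
<field dest="content2" source="content2"/>
```

If the value only ever lands in the content metadata and never on the document itself, neither entry will have any effect, so it is worth confirming that part first.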

