david.stu...@progressivealliance.co.uk wrote:
Hi All,
I think I have just about finished my plugin (Nutch 1.0), which adds extra
metadata during parsing. The problem I am having is that the data doesn't
seem to be added to the system (checked via Luke and readseg). I looked at
the wiki, but it seems to be for 0.9 and the syntax looks different.
{code}
public ParseResult filter(Content content, ParseResult parseResult,
    HTMLMetaTags metaTags, DocumentFragment doc) {
  try {
    // Raw page bytes. NOTE: new String(byte[]) decodes with the platform
    // default charset; pass the page encoding explicitly if it is known.
    byte[] contentInOctets = content.getContent();
    String input = new String(contentInOctets);

    // Run the XSLT transform over the page source
    XSLTSimpleTransform docTransform = new XSLTSimpleTransform();
    String docTrans = docTransform.doTransform(input);

    // Attach the result to the parse metadata of this URL
    Parse parse = parseResult.get(content.getUrl());
    Metadata metadata = parse.getData().getParseMeta();
    metadata.add("filter_html_data", docTrans);
  } catch (Exception e) {
    e.printStackTrace(LogUtil.getWarnStream(LOG));
  }
  return parseResult;
}
{code}
Did you declare the field you are adding in
IndexingFilter.addIndexBackendOptions(..)? See how the other indexing
plugins do this.
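
To illustrate why the declaration matters, here is a minimal, self-contained sketch of the flow. It is a simplified model, not Nutch code: the class FieldFlowSketch and its parse() and index() methods are hypothetical stand-ins for the parse filter and the indexing step, and it assumes that a field which is not declared ends up neither stored nor indexed. In the real plugin, the declaration would go into addIndexBackendOptions(..); as far as I recall, the shipped 1.0 plugins do this with LuceneWriter.addFieldOptions(..), so check their source for the exact call.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Simplified stand-in model of the parse -> index flow.
// None of these types or methods are Nutch classes; they are hypothetical.
class FieldFlowSketch {
    static final String FIELD = "filter_html_data";

    // Stand-in for the parse filter: puts the transformed text
    // into the parse metadata under the custom field name.
    static Map<String, String> parse(String transformed) {
        Map<String, String> parseMeta = new HashMap<>();
        parseMeta.put(FIELD, transformed);
        return parseMeta;
    }

    // Stand-in for the indexing step: copies parse metadata into the
    // index document, keeping only fields that were declared.
    // Assumption: an undeclared field is neither stored nor indexed.
    static Map<String, String> index(Map<String, String> parseMeta,
                                     Set<String> declaredFields) {
        Map<String, String> doc = new HashMap<>();
        for (Map.Entry<String, String> e : parseMeta.entrySet()) {
            if (declaredFields.contains(e.getKey())) {
                doc.put(e.getKey(), e.getValue());
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, String> meta = parse("<p>transformed</p>");
        Set<String> declared = new HashSet<>();

        // Field not declared: it never reaches the index document.
        System.out.println(index(meta, declared).containsKey(FIELD)); // false

        // Field declared: it is carried through.
        declared.add(FIELD);
        System.out.println(index(meta, declared).containsKey(FIELD)); // true
    }
}
```

The point of the sketch is only that adding a value to the parse metadata is not enough on its own: an indexing filter also has to copy it into the document and declare how the field should be stored and indexed.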
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com