Hi Lance,

> About removing non-nouns: the OpenNLP patch includes two simple 
> TokenFilters for manipulating terms with payloads. The 
> FilterPayloadFilter lets you keep or remove terms with given payloads.

Yes, I already use this in the schema.xml:
> <filter class="solr.FilterPayloadsFilterFactory" 
> payloadList="NN,NNS,NNP,NNPS,FM" keepPayloads="true"/>
> <filter class="solr.StripPayloadsFilterFactory"/>

Works fine :-)
But as Robert Muir stated in LUCENE-4345, I also think that using types
(and optionally storing them as payloads) would be a better approach.

> http://code.google.com/p/universal-pos-tags/
Thanks for the pointer. I used it to improve my English (Brown) whitelist for
UIMA :-)

Regards,

Kai Gülzau
