I'm not that familiar with Lucene, but what I'm basically looking to accomplish is the equivalent of a whitespace tokenizer with my own list of delimiters. In the Lucene docs this looks like simple inheritance, but I don't see any examples in PyLucene of how to subclass a CharTokenizer, other than the class in the lia SimpleKeywordAnalyzer sample, which doesn't appear to be used or to work as far as I can tell. I realize this is probably a bit out of place to ask here, but could someone explain or show me a valid example of a custom analyzer using a custom CharTokenizer in PyLucene?
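
To show what I mean, here's a rough sketch of the kind of thing I'm picturing. I'm assuming a PyLucene build where the module is "lucene" and the extension points are the Python* classes; the PythonCharTokenizer name and the delimiter list are just guesses on my part, since I can't actually find such a class -- which is really my question:

    import lucene
    lucene.initVM(lucene.CLASSPATH)

    from lucene import PythonAnalyzer
    # This is the part I can't figure out: PythonCharTokenizer is a guess,
    # I don't actually see a CharTokenizer extension class in PyLucene.
    from lucene import PythonCharTokenizer

    # My own delimiter list (just an example); anything not in here should
    # be treated as a token character.
    DELIMITERS = set(u' \t\r\n,;|/')

    class DelimiterTokenizer(PythonCharTokenizer):
        def isTokenChar(self, c):
            # Same idea as overriding CharTokenizer.isTokenChar() in Java:
            # True for characters that belong in a token, False for my
            # delimiter characters.
            return c not in DELIMITERS

    class DelimiterAnalyzer(PythonAnalyzer):
        def tokenStream(self, fieldName, reader):
            # Like WhitespaceAnalyzer, but splitting on my delimiter set.
            return DelimiterTokenizer(reader)

If there is no CharTokenizer extension point, an example built on PythonTokenizer/PythonTokenStream (or whatever the right mechanism is) would be just as helpful.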
Thanks in advance
