Is PyLucene able to handle custom tokenization
without any stemming?

PyLucene is able to handle anything Java Lucene is capable of handling since it wraps it (except for the RemoteSearchable class).

If you have questions about how to use Lucene that are not specific to Python, you should join the [email protected] mailing list, where there is a wealth of information about how to use Lucene.

Also, most of the samples from the 'Lucene in Action' book are available, in Python, in the PyLucene distribution. For examples of how to write a custom analyzer in Python for PyLucene, look at the files in samples/LuceneInAction/lia/analysis.
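
To make the tokenization side of the question concrete, here is a minimal sketch in plain Python of splitting mixed text into SGML tags, numbers and words, with no stemming applied. It does not use the PyLucene API at all: the names TOKEN_RE and tokenize are hypothetical, and the rules for what counts as a tag, a number or a word are assumptions chosen for illustration. The resulting tokens would still have to be emitted from a custom analyzer along the lines of the samples mentioned above.

```python
import re

# Hypothetical token rules (assumptions, not part of PyLucene or Lucene):
#   tag    - anything between '<' and '>' with no nested angle brackets
#   number - digits, optionally grouped with '.' or ',' (e.g. 3.14, 1,000)
#   word   - a run of Unicode letters, so accented words stay whole
TOKEN_RE = re.compile(
    r"(?P<tag><[^<>]+>)"
    r"|(?P<number>\d+(?:[.,]\d+)*)"
    r"|(?P<word>[^\W\d_]+)"
)


def tokenize(text):
    """Return (kind, value) pairs; values are kept verbatim, with no stemming."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        # lastgroup is the name of the alternative that matched
        tokens.append((match.lastgroup, match.group()))
    return tokens


if __name__ == "__main__":
    for kind, value in tokenize("<p>Téléchargez 3.14 words</p>"):
        print(kind, value)
```

Keeping the kind alongside each value leaves open whether tags and numbers should go into the same field as words or be indexed separately.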

Andi..


Actually, I would like to feed the index myself with
words from different languages (thus inconsistent
tokenization), but also SGML tags, and maybe even some
numbers.

Will that be possible? Where can I find hints on where
to look for that?

best regards,

J.






_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
