Hi, is there a Tokenizer in Lucene that tokenizes XML correctly?
That is, from the following XML:

    <span>this is <span attr="foo">example</span>text.</span>

one should get these tokens (or similar):

    <span> | this | is | <span attr="foo"> | example | </span> | text. | </span>

Or would I need to write such a Tokenizer myself?

Regards,
Christoph Hermann

-- 
Christoph Hermann
Institut für Informatik
Tel: +49 761-203-8171
Fax: +49 761-203-8162
e-mail: herm...@informatik.uni-freiburg.de
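P.S. To make the desired splitting concrete, here is a minimal sketch in plain Java. It is not a Lucene Tokenizer and uses no Lucene classes; the class name XmlTokenSketch and the method tokenize are made up for illustration. It assumes a token is either a whole tag (everything from '<' to the next '>') or a run of characters containing neither whitespace nor '<'. It does not handle comments, CDATA sections, or a '>' inside an attribute value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, only meant to illustrate the desired token boundaries.
class XmlTokenSketch {

    // Either a complete tag, or a run of characters that are neither
    // whitespace nor the start of a tag.
    private static final Pattern TOKEN = Pattern.compile("<[^>]*>|[^<\\s]+");

    static List<String> tokenize(String xml) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(xml);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String xml = "<span>this is <span attr=\"foo\">example</span>text.</span>";
        // Print the tokens separated by " | ", as in the example above.
        System.out.println(String.join(" | ", tokenize(xml)));
    }
}
```

Ideally the same boundaries would come out of a Tokenizer that plugs into a Lucene Analyzer, so that tags and text can be indexed together.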