Out of curiosity, when you try your other test string "aaa _bbb ccc"
what do the token byte offsets show?
Matt
Mark Harwood wrote:
On 1 Jul 2009, at 17:39, k.sayama wrote:
I could verify Token byte offsets
The sytsem outputs
aaa:0:3
bbb:0:3
ccc:4:7
That explains the highlighter behaviour. Clearly BBB is not at
position 0-3 in the String you supplied
String CONTENTS = "AAA :BBB CCC";
Looks like the Tokenizer needs fixing. Is this yours or a standard
Lucene class? If the latter, raising a JIRA bug with a Junit test
would be the best way to get things moving.
Cheers
Mark
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
--
Matthew Hall
Software Engineer
Mouse Genome Informatics
mh...@informatics.jax.org
(207) 288-6012
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org