I added the code you suggested, and the result is as follows:

Text: AaaBCcDdEFGgHhIiJKkLMmN
        Pos   start  end
        Inc   Ofst   Ofst
[Aaa]   1     0      3
[B]     1     3      4
[Cc]    1     4      6
[Dd]    1     6      8
[E]     1     8      9
[F]     1     9      10
[Gg]    1     10     12
[Hh]    1     12     14
[Ii]    1     14     16
[J]     1     16     17
[Kk]    1     17     19
[L]     1     19     20
[Mm]    1     20     22
[N]     1     22     23

Output: <B>AaaBCcDdEFGgHhIiJKkLMmN</B>

It seems to me that JapaneseAnalyzer produces correct tokens.
Any thoughts?

Koji

> -----Original Message-----
> From: markharw00d [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, September 06, 2005 3:37 PM
> To: java-user@lucene.apache.org
> Subject: Re: Highlighter apply to Japanese
>
> I don't know the behaviour of the Japanese Analyzer you are using.
> Can you add to your example diagnosis the Token.getPositionIncrement,
> Token.startOffset and Token.endOffset for each of the tokens?
>
> The highlighter groups tokens with overlapping start and end offsets
> into a single TokenGroup for the purposes of highlighting.
> This allows TokenStreams which produce multiple synonyms for the same
> source token to work. This behaviour was also required to get the
> CJKAnalyzer to work. It could be that the Analyzer you are using is
> producing a stream of tokens which *all* overlap?
>
> Cheers
> Mark
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
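
To make the grouping rule Mark describes easier to check against the offsets above, here is a minimal standalone sketch. It is not the Highlighter's actual code: the class OffsetGroupingSketch, the Tok type and the group method are hypothetical names, and it assumes that tokens arrive in start-offset order, that end offsets are exclusive, and that "overlapping" means a token starts before the current group's end offset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OffsetGroupingSketch {

    /** Hypothetical stand-in for a token: text plus start/end character offsets. */
    static final class Tok {
        final String text;
        final int start;
        final int end; // assumed exclusive

        Tok(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Groups tokens whose offsets overlap the running group.
     * Assumption: a token joins the current group only if it starts
     * strictly before the group's current end offset.
     */
    static List<List<Tok>> group(List<Tok> tokens) {
        List<List<Tok>> groups = new ArrayList<>();
        int groupEnd = -1;
        for (Tok t : tokens) {
            if (groups.isEmpty() || t.start >= groupEnd) {
                // No overlap with the current group: start a new one.
                groups.add(new ArrayList<>());
                groupEnd = t.end;
            } else {
                // Overlap: extend the current group's end if needed.
                groupEnd = Math.max(groupEnd, t.end);
            }
            groups.get(groups.size() - 1).add(t);
        }
        return groups;
    }

    /** The 14 tokens and offsets reported in the table above. */
    static List<Tok> tableTokens() {
        return Arrays.asList(
            new Tok("Aaa", 0, 3), new Tok("B", 3, 4), new Tok("Cc", 4, 6),
            new Tok("Dd", 6, 8), new Tok("E", 8, 9), new Tok("F", 9, 10),
            new Tok("Gg", 10, 12), new Tok("Hh", 12, 14), new Tok("Ii", 14, 16),
            new Tok("J", 16, 17), new Tok("Kk", 17, 19), new Tok("L", 19, 20),
            new Tok("Mm", 20, 22), new Tok("N", 22, 23));
    }

    /** Made-up overlapping bigram-style tokens, for contrast only. */
    static List<Tok> overlappingTokens() {
        return Arrays.asList(
            new Tok("AB", 0, 2), new Tok("BC", 1, 3), new Tok("CD", 2, 4));
    }

    public static void main(String[] args) {
        System.out.println(group(tableTokens()).size());       // prints 14
        System.out.println(group(overlappingTokens()).size()); // prints 1
    }
}
```

Under these assumptions each token in the table starts exactly where the previous one ends, so every token forms its own group, while the made-up overlapping tokens collapse into one group. If the real grouping rule treats touching offsets differently, the result could differ.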