Re: Highlighter fails using JapaneseAnalyzer

k.sayama Wed, 01 Jul 2009 09:40:16 -0700

I was able to verify the Token offsets.

The system outputs
aaa:0:3
bbb:0:3
ccc:4:7

The offset is reset to 0 at bbb, so it no longer matches the position in the original text.

Is this a problem in the Analyzer? Or is it in the Tokenizer?
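
To illustrate why these offsets would produce the output reported at the start of the thread, here is a minimal sketch in plain Java. It does not use Lucene and is not how the Highlighter is implemented; it only assumes that highlighting wraps the text between the start and end offsets of the matching token. The class OffsetHighlightSketch, the Tok type, and the "expected" offsets (counted by hand from the string "AAA :BBB CCC") are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

class OffsetHighlightSketch {

    /** Hypothetical stand-in for a token: term text plus start/end offsets. */
    static final class Tok {
        final String term;
        final int start;
        final int end;

        Tok(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    /** Wraps text[start, end) of the first token whose term equals queryTerm in <B> tags. */
    static String highlight(String text, List<Tok> tokens, String queryTerm) {
        for (Tok t : tokens) {
            if (t.term.equals(queryTerm)) {
                return text.substring(0, t.start)
                        + "<B>" + text.substring(t.start, t.end) + "</B>"
                        + text.substring(t.end);
            }
        }
        return text; // no matching token, nothing to highlight
    }

    /** Offsets as reported in this message (bbb restarts at 0). */
    static List<Tok> reported() {
        List<Tok> tokens = new ArrayList<Tok>();
        tokens.add(new Tok("aaa", 0, 3));
        tokens.add(new Tok("bbb", 0, 3));
        tokens.add(new Tok("ccc", 4, 7));
        return tokens;
    }

    /** Offsets counted by hand from "AAA :BBB CCC" (an assumption of this sketch). */
    static List<Tok> expected() {
        List<Tok> tokens = new ArrayList<Tok>();
        tokens.add(new Tok("aaa", 0, 3));
        tokens.add(new Tok("bbb", 5, 8));
        tokens.add(new Tok("ccc", 9, 12));
        return tokens;
    }

    public static void main(String[] args) {
        String text = "AAA :BBB CCC";
        System.out.println(highlight(text, reported(), "bbb")); // <B>AAA</B> :BBB CCC
        System.out.println(highlight(text, expected(), "bbb")); // AAA :<B>BBB</B> CCC
    }
}
```

With the reported offsets the sketch marks AAA, which is the same wrong fragment the Highlighter returned. With the hand-counted offsets it marks BBB.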

----- Original Message -----
From: "mark harwood" <markharw...@yahoo.co.uk>
To: <java-user@lucene.apache.org>
Sent: Thursday, July 02, 2009 12:55 AM
Subject: Re: Highligheter fails using JapaneseAnalyzer

How should I verify it?

Make sure the Token.startOffset and endOffset properties of Tokens produced by your TokenStream correctly define the location of Token.termBuffer in the original text.
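
The check described above can be sketched without Lucene, assuming the term text and offsets have already been printed from the TokenStream as term:start:end. The class TokenOffsetCheck and the method offsetsMatch are hypothetical names for this example. Comparing with equalsIgnoreCase assumes the analyzer only lowercases terms; an analyzer that stems or otherwise normalizes terms would need a looser comparison.

```java
class TokenOffsetCheck {

    /**
     * Returns true when text[start, end) equals the term, ignoring case.
     * Out-of-range or inverted offsets count as a mismatch.
     */
    static boolean offsetsMatch(String text, String term, int start, int end) {
        if (start < 0 || end > text.length() || start > end) {
            return false;
        }
        return text.substring(start, end).equalsIgnoreCase(term);
    }

    public static void main(String[] args) {
        String text = "AAA :BBB CCC";
        // term:start:end triples as they might be printed from a token stream
        String[] terms = {"aaa", "bbb", "ccc"};
        int[][] offsets = {{0, 3}, {0, 3}, {4, 7}};

        for (int i = 0; i < terms.length; i++) {
            int start = offsets[i][0];
            int end = offsets[i][1];
            String verdict = offsetsMatch(text, terms[i], start, end) ? "ok" : "MISMATCH";
            System.out.println(terms[i] + ":" + start + ":" + end + " " + verdict);
        }
    }
}
```

Any token flagged as a mismatch has offsets that do not point at its own text, and a highlighter that relies on those offsets would mark the wrong span.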




----- Original Message ----
From: k.sayama <sake-gin...@nifty.com>
To: java-user@lucene.apache.org
Sent: Wednesday, 1 July, 2009 16:13:17
Subject: Re: Highligheter fails using JapaneseAnalyzer

Sorry,
I cannot verify the Token byte offsets produced by JapaneseAnalyzer.
How should I verify it?

----- Original Message -----
From: "mark harwood" <markharw...@yahoo.co.uk>
To: <java-user@lucene.apache.org>
Sent: Wednesday, July 01, 2009 11:31 PM
Subject: Re: Highligheter fails using JapaneseAnalyzer

Can you verify the Token byte offsets produced by this particular analyzer
are correct?

----- Original Message ----
From: k.sayama <sake-gin...@nifty.com>
To: java-user@lucene.apache.org
Sent: Wednesday, 1 July, 2009 15:22:37
Subject: Re: Highligheter fails using JapaneseAnalyzer

Hi,

I tested with SimpleAnalyzer, StandardAnalyzer, and CJKAnalyzer,
but the problem did not happen with any of them.

I think the problem is in JapaneseAnalyzer.
Can this problem be solved?

Does the same thing happen when you use SimpleAnalyzer, or
StandardAnalyzer?

I have a sneaking suspicion that the : in your contents string is what's
causing your issue here, as : is a reserved character that denotes a
field specification. But I could be wrong.

Try swapping analyzers. If you no longer have the same issue with
Simple, try Standard. Assuming the same problem shows up there, I think
you might need to do something about the :.

Matt

k.sayama wrote:

Hello,

I've tried to highlight a string using Highlighter (2.4.1) and
JapaneseAnalyzer, but the following code extract shows the problem:

String F = "f";                      // field name
String CONTENTS = "AAA :BBB CCC";    // text to highlight
JapaneseAnalyzer analyzer = new JapaneseAnalyzer();
QueryParser qp = new QueryParser( F, analyzer );
Query query = qp.parse( "BBB" );     // search for the middle term
Highlighter h = new Highlighter( new QueryScorer( query, F ) );

// Print the best fragment; BBB should be wrapped in <B>...</B>
System.out.println( h.getBestFragment( analyzer, F, CONTENTS ) );

The system outputs

<B>AAA</B> :BBB CCC

When you change CONTENTS to "AAA _BBB CCC",
the system outputs

AAA _<B>BBB</B> CCC

Is there a problem somewhere?
Thanks in advance.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



--
Matthew Hall
Software Engineer
Mouse Genome Informatics
mh...@informatics.jax.org
(207) 288-6012


