Hi Koji,
Thank you again for your continued assistance.
Below is the code I used with Lucene 2.4 to highlight terms
(it did not highlight them correctly).
Following up on your previous email: is there a way to access a TermVector
containing only matched terms, or is my previous approach still the
correct way to proceed?
Best,
Steve
public HitTermTagger(Scorer pScorer) {
    _mScorer = pScorer;
}

public ArrayList<KeyValuePair<Integer,Integer>> tagText(TokenStream ptsTokenStream)
{
    StringBuilder sbTaggedText = new StringBuilder();
    final Token reusableToken = new Token();
    int startOffset = -1;
    int endOffset = -1;
    float score;
    ArrayList<KeyValuePair<Integer,Integer>> results =
        new ArrayList<KeyValuePair<Integer,Integer>>();
    TokenGroup tokenGroup = new TokenGroup();
    //initialize scorer
    _mScorer.startFragment(null);
    try {
        for(Token nextToken = ptsTokenStream.next(reusableToken);
            (nextToken != null);
            nextToken = ptsTokenStream.next(reusableToken))
        {
            //if((tokenGroup.getNumTokens() > 0)&&(tokenGroup.isDistinct(nextToken))) {
            if(nextToken.startOffset() > endOffset) {
                score = _mScorer.getTokenScore(nextToken);
                if(score > 0.0) {
                    startOffset = nextToken.startOffset();
                    endOffset = nextToken.endOffset();
                    results.add(new KeyValuePair<Integer,Integer>(startOffset,endOffset));
                }
                //tokenGroup.clear();
            }
            //tokenGroup.addToken(nextToken,_mScorer.getTokenScore(nextToken));
        }
    }
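
For anyone who wants to try the loop without a Lucene index, here is a minimal self-contained sketch of the same offset-collecting logic. SimpleToken and the set of query terms are hypothetical stand-ins for Lucene's Token and for the Scorer's "score > 0" check; they are not Lucene classes, and the sketch is only meant to isolate the logic, not to reproduce the highlighter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins only; none of these types come from Lucene.
class OffsetCollector {

    /** Minimal token: term text plus start/end character offsets. */
    static final class SimpleToken {
        final String term;
        final int start;
        final int end;

        SimpleToken(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Returns {start, end} pairs for tokens whose term is in queryTerms.
     * Membership in queryTerms stands in for "getTokenScore(token) > 0".
     */
    static List<int[]> collect(List<SimpleToken> tokens, Set<String> queryTerms) {
        List<int[]> results = new ArrayList<int[]>();
        int endOffset = -1;
        for (SimpleToken t : tokens) {
            // Skip tokens that start at or before the end of the last recorded hit.
            if (t.start > endOffset && queryTerms.contains(t.term)) {
                results.add(new int[] { t.start, t.end });
                endOffset = t.end;
            }
        }
        return results;
    }
}
```

Because the match test is a plain term lookup, this version cannot tell a term that satisfied a phrase query from a stray occurrence of the same term, which may be related to the incorrect highlighting I am seeing.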
-----Original Message-----
From: Koji Sekiguchi [mailto:[email protected]]
Sent: Monday, April 19, 2010 9:02 PM
To: [email protected]
Subject: Re: Term offsets for highlighting
Stephen Greene wrote:
> Hi Koji,
>
> An additional question. Is it possible to access the FieldTermStack from
> the FastVectorHighlighter after it has been populated with matching
> terms from the field?
>
> I think this would provide an ideal solution for this problem, as
> ultimately I am only concerned with returning positional offsets to have
> highlighting tags applied to them in a separate process.
>
> Thank you for your insight,
>
> Steve
>
Hi Steve,
You cannot access FieldTermStack from FVH, but I think you
can create one yourself. To see how, please refer to
FieldTermStackTest.java. Instantiating FieldTermStack requires a
FieldQuery object, which can be obtained from FVH.
But I don't understand why you need FieldTermStack. Wouldn't just using
Lucene's TermVector with offsets (and positions, if necessary)
solve your problem?
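
As a rough sketch of the term vector idea, assume the per-term offsets have already been read from a term vector stored with offsets and copied into a plain map. The class name, the method name and the map layout below are hypothetical, chosen only for illustration; no Lucene calls are shown. A term vector holds every term in the field, so the offsets still have to be filtered by the query terms:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; the map stands in for data read from a term vector.
class TermVectorOffsets {

    /**
     * termVector maps each term in the field to its {start, end} offsets.
     * Returns the offsets of the query terms only, sorted by start offset.
     */
    static List<int[]> matchedOffsets(Map<String, List<int[]>> termVector,
                                      Set<String> queryTerms) {
        List<int[]> hits = new ArrayList<int[]>();
        for (String term : queryTerms) {
            List<int[]> offsets = termVector.get(term);
            if (offsets != null) {
                hits.addAll(offsets);
            }
        }
        // Order hits by position in the text so tags can be applied in one pass.
        Collections.sort(hits, new Comparator<int[]>() {
            public int compare(int[] a, int[] b) {
                return Integer.compare(a[0], b[0]);
            }
        });
        return hits;
    }
}
```

This only matches individual terms. If phrase queries must be respected, the positions would also have to be checked, which is the kind of work FieldTermStack and FieldQuery do inside FVH.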
Koji
--
http://www.rondhuit.com/en/
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]