Sure ... The frequency count is maintained in the index to enable
relevance scoring. You can pull it out using a TermDocs, which
enumerates, for a given term, the documents that contain it and how
often it occurs in each. Sorry, I don't have example code handy for
this.
-Mike
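For what it's worth, here is a rough plain-Java sketch of the idea
rather than real Lucene code. ToyPostings and the "ENTITY" marker term
are made up for illustration; the assumption is that each flagged token
also gets indexed under some marker term, so that the marker's
per-document frequency is the entity count. The docFreqs map plays the
role of the (document, frequency) pairs a TermDocs would walk through.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy stand-in for an inverted index; hypothetical, not a Lucene class. */
public class ToyPostings {
    // term -> (docId -> number of occurrences of that term in the document)
    private final Map<String, Map<Integer, Integer>> postings = new TreeMap<>();

    /** Records one occurrence of every term in the list for the given document. */
    public void addDocument(int docId, List<String> terms) {
        for (String term : terms) {
            postings.computeIfAbsent(term, t -> new TreeMap<>())
                    .merge(docId, 1, Integer::sum);
        }
    }

    /** Returns docId -> frequency for one term (empty if the term is unknown). */
    public Map<Integer, Integer> docFreqs(String term) {
        return postings.getOrDefault(term, Collections.emptyMap());
    }

    /** Returns how often the term occurs in one document, or 0 if it does not. */
    public int freq(String term, int docId) {
        return docFreqs(term).getOrDefault(docId, 0);
    }
}
```

With document 0 = [ENTITY, the, ENTITY] and document 1 = [the],
freq("ENTITY", 0) gives 2 and freq("ENTITY", 1) gives 0.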
On 1/1/2013 4:24 PM, Itai Peleg wrote:
That worked great :) thanks a lot for the quick reply!
I have another question - after I "flagged" all my special tokens (in my
case, the ones that are entities), is there an elegant way of counting how
many of them I have in a document? I found an ugly way to do that, but I'm
sure there's a better one.
Thanks in advance,
Itai
2012/12/31 Michael Sokolov <soko...@ifactory.com>
On 12/31/2012 11:39 AM, Itai Peleg wrote:
Hi all,
Can someone please post a simple example showing how to add additional
attributes to a token in a TokenStream (inside incrementToken, for example)?
I'm working on entity extraction and want to flag specific tokens as
entities, but I'm having problems.
Thanks in advance,
Itai
Here's a simple example of a filter that adds an attribute saying
whether a token is "the":
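import java.io.IOException;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// YourAttribute is your own Attribute interface (with a matching
// YourAttributeImpl class); it is not part of Lucene.
final class YourTokenStream extends TokenFilter {
  private final YourAttribute att;
  private final CharTermAttribute term;

  public YourTokenStream (TokenStream upstream) {
    super (upstream);  // TokenFilter keeps the wrapped stream in "input"
    att = addAttribute (YourAttribute.class);
    term = addAttribute (CharTermAttribute.class);
  }

  @Override
  public boolean incrementToken () throws IOException {
    if (!input.incrementToken ()) {
      return false;  // upstream is exhausted
    }
    // term.buffer() can be longer than the current term, so compare
    // via toString(), which covers only the first term.length() chars.
    att.setIsAnEnglishArticle ("the".equals (term.toString ()));
    return true;  // pass every token through, flagged or not
  }
}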
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org