Quoting mark harwood <[EMAIL PROTECTED]>:
Hi, Mark,
I just used StandardAnalyzer, and the code is as follows:
=====================================================
Analyzer analyzer = new StandardAnalyzer();
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
String line = in.readLine();
if (line == null || line.length() == 0)   // no query entered
    return;
Query query = QueryParser.parse(line, "contents", analyzer);
Hits hits = searcher.search(query);
Highlighter highlighter = new Highlighter(new QueryScorer(query));
========================================================
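As an experiment I am also considering capping the fragment size explicitly, in case the default fragmenter is part of the problem. This is only a rough sketch of a possible workaround, not something I have tested; SimpleFragmenter and the 100-character size are just my guesses:
=====================================================
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleFragmenter;

// Same highlighter as above, but with an explicit cap of roughly
// 100 characters per fragment (the size is only an example value).
Highlighter highlighter = new Highlighter(new QueryScorer(query));
highlighter.setTextFragmenter(new SimpleFragmenter(100));
=====================================================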
With the original code, part of the highlighted result is:
========================================================
sectors in South Africa have failed to incorporate many co-management
principles, such as joint... of the RNP. However, poor representation of
community interests on the joint management committee... of a joint management
committee and the improvement of infrastructure in the area. The key differences
......
========================================================
Of course, the actual result is far longer than this.
The paper itself looks normal to me, so I am confused about why this is happening.
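In case it turns out to be the overlapping-token situation you describe below, here is a small diagnostic sketch I could run. It assumes the Lucene 1.4-style TokenStream API and simply prints each token with its position increment (an increment of 0 would mean the token overlaps the previous one):
=====================================================
import java.io.StringReader;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

public class TokenDump {
    public static void main(String[] args) throws Exception {
        // Replace this sample string with the text of the paper.
        String text = "joint management committee of the RNP";
        TokenStream ts =
            new StandardAnalyzer().tokenStream("contents", new StringReader(text));
        Token token;
        while ((token = ts.next()) != null) {
            // An increment of 0 means this token overlaps the previous one,
            // which is where the highlighter cannot break a fragment.
            System.out.println(token.termText() + " +" + token.getPositionIncrement());
        }
        ts.close();
    }
}
=====================================================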
Thanks for your help,
Ying
> >> One of my
> >> search results from our
> >> records contains far too much of the text
>
> This is a problem I haven't seen before. I suspect it
> may have something to do with your choice of analyzer.
> Your paper will only ever be fragmented on "token gap"
> boundaries, i.e. points in the token stream where the
> current token's position does not overlap with the
> previous token's. If the section of your text which
> contains the search terms consists of a long stream of
> overlapping tokens, you will end up with a long
> highlighted selection.
>
> Which analyzer are you using, out of interest?
>
>
> Cheers
> Mark
>
>
>
> --- [EMAIL PROTECTED] wrote:
> >
> >
> > Hi, All,
> >
> > I use the Lucene highlight package to generate KWIC for
> > our application.
> >
> > Part of the code is as follows:
> >
> > =====================================================
> > if (text != null) {
> >     TokenStream tokenStream = analyzer.tokenStream("contents",
> >         new StringReader(text));
> >     // Get the 3 best fragments, separated by "..."
> >     result = highlighter.getBestFragments(tokenStream, text, 3, "...");
> > }
> > =====================================================
> >
> > However, I ran into a very strange problem. One of my
> > search results from our records contains far too much
> > of the text of the paper. It doesn't happen for the
> > same paper when I change the search criteria.
> >
> > Thanks very much for your help,
> > Ying