Yes, this can easily be done using the TokenStream class and hence getting
the BestTokens.
But of course you have to have this content in the index.
DONE
Ramesh Reddy
On Wed, 2006-07-12 at 12:43 +0100, Mike Streeton wrote:
> The simplest solution is always the best - when storing the p
Sweet!
The simplest solution is always the best - when storing the page, do not
break up sentences. So a page will be all the sentences that occur on
it. If a sentence starts on one page and finishes on the next it will be
included in both pages in the index.
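A minimal sketch of that idea in Python, not tied to any particular indexing library. The function name, the sample pages and the naive "a sentence ends at . ! or ?" splitter are all assumptions made for illustration; a real book would need a proper sentence detector.

```python
import re
from bisect import bisect_right


def pages_with_whole_sentences(pages):
    """For each page, return the text of every sentence that touches it.

    A sentence that starts on one page and ends on the next is returned
    in full for both pages, so a phrase query can match on either.
    """
    # Join the pages into one text, remembering where each page starts.
    starts, pos = [], 0
    for page in pages:
        starts.append(pos)
        pos += len(page) + 1  # +1 for the joining space
    text = " ".join(pages)

    per_page = [[] for _ in pages]
    # Naive splitter (assumption): a sentence runs up to . ! or ?
    for match in re.finditer(r"[^.!?]+[.!?]*", text):
        raw = match.group()
        sentence = raw.strip()
        if not sentence:
            continue
        # Offsets of the first and last non-blank characters.
        first_char = match.start() + (len(raw) - len(raw.lstrip()))
        last_char = first_char + len(sentence) - 1
        first_page = bisect_right(starts, first_char) - 1
        last_page = bisect_right(starts, last_char) - 1
        for i in range(first_page, last_page + 1):
            per_page[i].append(sentence)

    # One string per page, ready to be stored as that page's content.
    return [" ".join(sentences) for sentences in per_page]


if __name__ == "__main__":
    sample = ["The quick brown fox jumps over", "the lazy dog. It then sleeps."]
    for number, content in enumerate(pages_with_whole_sentences(sample), 1):
        print(number, content)
```

The cost is that a sentence spanning a page break is stored twice, so a hit on it will be reported for both pages.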
Hope this helps
Mike
www.ardentia.com
Hello Erick,
I have been trying some scenarios on Google Books and apparently found a
Google bug ...
It looks like they use approach number 2, as this query illustrates.
http://books.google.com/books?vid=ISBN1564968316&id=14Xx2T8tmMYC&pg=PA8&lpg=PA8&dq=%2B%22the+site+is+unburdened%22&sig=QR
I can think of several approaches, but the experts will no doubt show me up
...
1> index the entire book as a single document. Also, index the beginning and
ending offset of each page in separate "documents". Assuming you can find
the offset in the big doc of each matching phrase, you can also fin