Your search engine would extract the text content from a PDF file, and all
markup, pictures, etc. would be lost. So when you search, you would get back
only text, highlighted or not.
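
To show the user which page matched, one option is to treat each page's
extracted text as its own searchable unit and keep the page number next to it.
Below is a rough sketch of that idea in plain Java, not a Lucene or PDFBox
implementation. It assumes the per-page text has already been extracted (for
example with PDFBox's PDFTextStripper limited to one page at a time). The names
PageSearch and Hit, the [[ ]] highlight markers, and ranking by match count are
all made up for illustration; match count is only a crude stand-in for Lucene's
relevance scoring.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: searches per-page text that was extracted beforehand.
final class PageSearch {

    /** One matching page: 1-based page number, match count, highlighted text. */
    static final class Hit {
        final int page;
        final int count;
        final String highlighted;

        Hit(int page, int count, String highlighted) {
            this.page = page;
            this.count = count;
            this.highlighted = highlighted;
        }
    }

    /** Returns the pages containing the keyword, most matches first. */
    static List<Hit> search(List<String> pageTexts, String keyword) {
        List<Hit> hits = new ArrayList<>();
        if (keyword.isEmpty()) {
            return hits; // nothing sensible to match
        }
        // Literal, case-insensitive match of the keyword.
        Pattern pattern = Pattern.compile(Pattern.quote(keyword), Pattern.CASE_INSENSITIVE);
        for (int i = 0; i < pageTexts.size(); i++) {
            String text = pageTexts.get(i);
            Matcher m = pattern.matcher(text);
            StringBuilder out = new StringBuilder();
            int last = 0;
            int count = 0;
            while (m.find()) {
                // Copy the text before the match, then the match wrapped in markers.
                out.append(text, last, m.start()).append("[[").append(m.group()).append("]]");
                last = m.end();
                count++;
            }
            if (count > 0) {
                out.append(text, last, text.length());
                hits.add(new Hit(i + 1, count, out.toString()));
            }
        }
        // Crude relevance: more matches first; ties keep page order (stable sort).
        hits.sort((a, b) -> Integer.compare(b.count, a.count));
        return hits;
    }
}
```

Each Hit carries the page number, so the application can open the original PDF
at that page in a viewer. That is how pictures and layout would reach the user:
from the PDF itself, not from the index, which holds only text.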


Best Regards
Alexander Aristov


On 18 February 2011 21:29, Gong Li <ee07b...@gmail.com> wrote:

> Hi,
>
> I am developing a local PDF search engine. I am using the PDFBox and
> Lucene APIs.
>
> I must show the user the PDF page containing the keywords (highlighting
> would be great) and sort the results by relevance (the default in Lucene).
> How?
>
> Also, if there are pictures on the PDF page, how could they be displayed to
> the user after indexing and searching the extracted text?
>
> Thanks
>
