Hi!
We could do this for certain types of documents.
But for PDF files, I think we should use a new feature provided by PDFBox,
PDFHighlighter.
It actually relies on an Acrobat feature described here:
http://partners.adobe.com/public/developer/en/pdf/HighlightFileFormat.pdf

When the user selects the "View cache" or "View highlight" link, we could
generate the XML highlight file and use it to highlight the hits directly
inside the PDF.
That's even better than Google's cache...
Otherwise, we could use Yahoo's solution (launch the search inside
Acrobat Reader, see the search parameter in
http://partners.adobe.com/public/developer/en/acrobat/PDFOpenParameters.pdf
).
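
To make the two options a bit more concrete, here is a rough sketch in plain
Java, with no PDFBox dependency. Everything in it is an assumption on my side:
the class and method names are made up for illustration, the hit offsets are
supposed to come from whatever extracts the PDF text, and the XML layout and
the "#xml=" / "#search=" fragments are only my reading of the two Adobe
documents linked above, so please check them against those documents before
relying on this.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper, not part of Nutch or PDFBox. */
final class PdfHitHighlightSketch {

    /** One hit: zero-based page index, character offset on that page, length. */
    static final class Hit {
        final int page;
        final int pos;
        final int len;

        Hit(int page, int pos, int len) {
            this.page = page;
            this.pos = pos;
            this.len = len;
        }
    }

    /** Builds a highlight file body (assumed layout, character units). */
    static String toHighlightXml(Hit[] hits) {
        StringBuilder sb = new StringBuilder();
        sb.append("<XML>\n<Body units=characters version=2>\n<Highlight>\n");
        for (Hit h : hits) {
            sb.append("<loc pg=").append(h.page)
              .append(" pos=").append(h.pos)
              .append(" len=").append(h.len)
              .append(">\n");
        }
        sb.append("</Highlight>\n</Body>\n</XML>\n");
        return sb.toString();
    }

    /** Option 1: point the reader at a highlight file served next to the PDF. */
    static String toHighlightUrl(String pdfUrl, String highlightXmlUrl) {
        return pdfUrl + "#xml=" + highlightXmlUrl;
    }

    /** Option 2: let the reader run the search itself on the query terms. */
    static String toSearchUrl(String pdfUrl, String... terms) {
        try {
            // URLEncoder emits '+' for spaces; a fragment needs %20 instead.
            String words = URLEncoder.encode(String.join(" ", terms), "UTF-8")
                                     .replace("+", "%20");
            // %22 is an encoded double quote around the word list (assumption:
            // the reader decodes the fragment before parsing it).
            return pdfUrl + "#search=%22" + words + "%22";
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        Hit[] hits = { new Hit(0, 12, 5), new Hit(2, 40, 7) };
        System.out.print(toHighlightXml(hits));
        System.out.println(toHighlightUrl("http://example.org/doc.pdf",
                                          "http://example.org/hits.xml"));
        System.out.println(toSearchUrl("http://example.org/doc.pdf",
                                       "nutch", "crawler"));
    }
}
```

The "View highlight" link would then point at the URL from toHighlightUrl(),
with the XML generated on the fly for the current query, while the Yahoo-style
variant only needs the URL from toSearchUrl() and no server-side XML at all.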

I know these are only solutions for PDFs but that's the format I'm working
on right now and I think its use is widespread so it might be useful to
implement these features.

Stephan


On Wed, March 23, 2005 11:19, Andrzej Bialecki said:
> John X wrote:
>> Hi, All,
>>
>> Attached please find servlet Cached.java that serves raw Content
>> of any mime type. Current cached.jsp handles mime type text/* only.
>> If no objection, it is going to be committed in a few days.
>
> I think this would be quite useful.
>
> However, what I think is ultimately needed to match the features of
> other search engines is not the ability to return the cached non-html
> content (there might even be copyright issues with this function...),
> but an html rendering of non-html content, a la Google's "View as HTML"
> function.
>
> --
> Best regards,
> Andrzej Bialecki
>   ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
>
>




_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers