On Wed, Mar 23, 2005 at 11:53:21AM +0100, Stephan Lagraulet wrote:
> Hi!
> We could do this for certain type of documents.
> But for PDF files, I think we should use a new feature provided by PDFBox,
> PdfHighlighter.
> This is actually using an Acrobat feature described here :
> http://partners.adobe.com/public/developer/en/pdf/HighlightFileFormat.pdf
>
> When the user selects the link "View cache" or "View highlight", we could
> generate the XML highlight file and use it to highlight the hits directly
> inside the PDF.
> That's even better than Google cache...
> We could otherwise use Yahoo solution (launch the search engine inside
> Acrobat reader -
> http://partners.adobe.com/public/developer/en/acrobat/PDFOpenParameters.pdf
> / search parameters).
>
> I know these are only solutions for PDFs but that's the format I'm working
> on right now and I think its use is widespread so it might be useful to
> implement these features.

Could you provide a code snippet or, better, a patch?

Thanks,

John

>
> Stephan
>
>
> On Wed, March 23, 2005 11:19, Andrzej Bialecki said:
> > John X wrote:
> >> Hi, All,
> >>
> >> Attached please find servlet Cached.java that serves raw Content
> >> of any mime type. Current cached.jsp handles mime type text/* only.
> >> If no objection, it is going to be committed in a few days.
> >
> > I think this would be quite useful.
> >
> > However, what I think is ultimately needed to match the features of
> > other search engines is not the ability to return the cached non-html
> > content (there might even be copyright issues with this function...),
> > but an html rendering of non-html content, a la Google's "View as HTML"
> > function.
> >
> > --
> > Best regards,
> > Andrzej Bialecki
> >  ___. ___ ___ ___ _ _   __________________________________
> > [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> > ___|||__||  \|  ||  |  Embedded Unix, System Integration
> > http://www.sigram.com  Contact: info at sigram dot com
> >
> >
>
>

_______________________________________________
Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers
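
As a rough illustration of the second option discussed in this thread (opening the PDF with Acrobat's search parameters), here is a minimal sketch in plain Java. It is not a patch against Nutch or PDFBox: the class `PdfSearchLink` and the method `buildSearchUrl` are hypothetical names made up for this example. It assumes the `#search="word1 word2"` fragment form described in the PDFOpenParameters document linked above, and it simply drops quotes and the characters `=`, `#` and `&` from the terms, on the assumption that they cannot be escaped inside the fragment. Escaping the result for use in an HTML attribute is left to the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PdfSearchLink {

    /**
     * Hypothetical helper (not part of Nutch or PDFBox): appends an Acrobat
     * "search" open parameter to a PDF URL so the viewer searches for the
     * query terms when the document opens.
     */
    public static String buildSearchUrl(String pdfUrl, List<String> queryTerms) {
        List<String> words = new ArrayList<>();
        for (String term : queryTerms) {
            // Assumption: quotes and the characters = # & cannot be escaped
            // inside the fragment, so they are removed from each term.
            String cleaned = term.replaceAll("[\"=#&]", "").trim();
            if (!cleaned.isEmpty()) {
                words.add(cleaned);
            }
        }
        if (words.isEmpty()) {
            // Nothing usable to search for: link to the plain PDF.
            return pdfUrl;
        }
        // The word list is enclosed in quotation marks and separated by spaces.
        return pdfUrl + "#search=\"" + String.join(" ", words) + "\"";
    }

    public static void main(String[] args) {
        String url = buildSearchUrl("http://example.com/doc.pdf",
                                    Arrays.asList("nutch", "highlight"));
        System.out.println(url);
        // prints: http://example.com/doc.pdf#search="nutch highlight"
    }
}
```

A "View highlight" link on the results page could point at such a URL instead of the raw document. Whether a given Acrobat version or browser plugin honours the parameter would need to be checked.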
