Dawid Weiss wrote:

Ok, the workaround gives me a raw text of the page, I don't need it. I
need the textual representation. So the previous question still holds:

Can you show me how to get hold of a textual representation of a hit
(or its summary)?



It looks like some modifications to Nutch will be necessary... I remember a similar discussion on Lucene highlighting.


The places that require modification are Summary.Highlight.toString(), Summary.Ellipsis.toString(), and Summary.toString(). I think that adding a method Summary.toString(String hiStart, String hiEnd), which specifies the highlight strings, and reimplementing Summary.toString() as Summary.toString("<b>", "</b>") would do the trick.
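A rough sketch of that idea follows. It is not the actual Nutch source: SummarySketch and its nested classes are simplified, hypothetical stand-ins for Summary, Summary.Fragment, Summary.Highlight and Summary.Ellipsis, and wrapping the ellipsis in the highlight markers is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for Nutch's Summary class. */
public class SummarySketch {

  /** A plain piece of summary text. */
  public static class Fragment {
    protected final String text;

    public Fragment(String text) { this.text = text; }

    /** Plain fragments ignore the highlight markers. */
    public String toString(String hiStart, String hiEnd) { return text; }

    @Override
    public String toString() { return toString("<b>", "</b>"); }
  }

  /** A fragment that matched a query term. */
  public static class Highlight extends Fragment {
    public Highlight(String text) { super(text); }

    @Override
    public String toString(String hiStart, String hiEnd) {
      return hiStart + text + hiEnd;
    }
  }

  /** A gap between excerpts (assumed here to be wrapped in the markers). */
  public static class Ellipsis extends Fragment {
    public Ellipsis() { super(" ... "); }

    @Override
    public String toString(String hiStart, String hiEnd) {
      return hiStart + text + hiEnd;
    }
  }

  private final List<Fragment> fragments = new ArrayList<>();

  public SummarySketch add(Fragment fragment) {
    fragments.add(fragment);
    return this;
  }

  /** Renders the summary using caller-supplied highlight markers. */
  public String toString(String hiStart, String hiEnd) {
    StringBuilder sb = new StringBuilder();
    for (Fragment fragment : fragments) {
      sb.append(fragment.toString(hiStart, hiEnd));
    }
    return sb.toString();
  }

  /** The default rendering delegates to the parameterized version. */
  @Override
  public String toString() { return toString("<b>", "</b>"); }
}
```

With something along these lines, a caller that wants the plain textual representation would pass empty markers, e.g. summary.toString("", ""), while existing callers of toString() keep getting the bold markup.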

--
Best regards,
Andrzej Bialecki

-------------------------------------------------
Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
-------------------------------------------------
FreeBSD developer (http://www.freebsd.org)



