Hi there,
I vaguely remember a comment from Andrzej saying that one of the HTTP protocol plugins can now download external files, such as JavaScript, that belong to the page itself. However, I was not able to find this comment again. Was that comment actually made, or am I mixing things up?

Anyway, I'm looking for a way to download external items that belong to an HTML page, like images, JavaScript files, or CSS files. Since this requires parsing the page anyway, I was thinking I could create a new fetchlist, fetch the content, and then merge the results back together using MapReduce. On the other hand, it would be good to reuse the thread that is already connected to the host to download the items.

Any comments and ideas?

Thanks!
Stefan


_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
