Hi,
Have you checked the URL filters (regex-urlfilter or crawl-urlfilter)? The
".ppt" suffix is excluded by default.
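
To illustrate why the link gets dropped, here is a minimal Python sketch of how this kind of filter behaves, assuming the usual semantics: rules are tried in order, the first matching pattern decides ("+" accepts, "-" rejects), and a URL that matches no rule is rejected. The rule lists, the accepts() helper, and the slides.ppt file name are hypothetical, made up for this example; they are not copied from any Nutch release, so check the actual filter file in your conf directory.

```python
import re

# Hypothetical rule set modeled on a Nutch-style URL filter file.
# The suffix list is illustrative only, not taken from a real release.
RULES_DEFAULT = [
    ("-", r"^(file|ftp|mailto):"),           # skip non-http schemes
    ("-", r"\.(gif|jpg|zip|ppt|xls|exe)$"),  # skip suffixes, including ppt
    ("+", r"^http://localhost:8080/"),       # accept the local application
]

# The same rules with "ppt" removed from the skipped suffixes.
RULES_PPT_ALLOWED = [
    ("-", r"^(file|ftp|mailto):"),
    ("-", r"\.(gif|jpg|zip|xls|exe)$"),
    ("+", r"^http://localhost:8080/"),
]


def accepts(url, rules):
    """Return True if the first matching rule is '+'; reject if none match."""
    for sign, pattern in rules:
        if re.search(pattern, url):
            return sign == "+"
    return False


# Hypothetical link target; the original mail does not name the file.
PPT_URL = "http://localhost:8080/search_sample/slides.ppt"

print(accepts(PPT_URL, RULES_DEFAULT))      # False
print(accepts(PPT_URL, RULES_PPT_ALLOWED))  # True
```

In this sketch the suffix rule comes before the accept rule, so the .ppt link is rejected before the localhost pattern is ever tried. Removing "ppt" from the suffix list lets the URL fall through to the accept rule.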
Regards
Michael
Ayyanar Inbamohan wrote:
Hi all,
I am using the PowerPoint plugin from JIRA, and when I
crawl my application, which has a link to a .ppt file, Nutch 0.7
does not fetch the PowerPoint files at all.
I am crawling my local application at
http://localhost:8080/search_sample/index.html
I put this URL in url.intranet,
added an href to a PowerPoint file in index.html,
and then started the crawl, but the file is not crawled.
Thanks in advance,
Ayyanar
--
Michael Nebel
http://www.nebel.de/
http://www.netluchs.de/