Hi all,

I am currently using Nutch to crawl all of our Jira issues. Has anyone
done this before? Nutch crawls most of the issues, but some are still
missing. These are the two URLs I put in seeds.txt:
1. https://our jira/secure/Dashboard.jspa
2. https://our jira/secure/BrowseProjects.jspa#all
Either these two URLs are not enough, or, my guess, the
db.fetch.interval.default value in nutch-site.xml is not appropriate, so
Nutch did not crawl all the pages (I currently have it set to 86400
seconds). Does anyone have any ideas?

Thanks,
Joshua
