I have a set of paginated pages that only work if they are crawled in a given sequence and in the same session.
For example, the first three URLs are:

http://www.myhost.com/?page_number=1
http://www.myhost.com/?page_number=2
http://www.myhost.com/?page_number=3

The first page has a link to the second page. The second page has links to the first and second pages. The third page has links to the third and second pages, and so on.

Nutch is able to crawl the first 6 pages, but beyond that it either cannot crawl or gets an empty result. If I manually click through the pagination in a browser, I can reach the end with no problem.

Is the Nutch crawl session timing out? If so, how do we increase the timeout? I also tried crawling with one thread, but got the same result.

Any suggestions?

---
Thanks/Regards,
Parvez