Hi, I am using Nutch 2.1 with MySQL. The requirement is to crawl all the paginated web pages.
Say, for example, I give the first page of some website (http://x.com?num=1) as the seed URL and configure the URL filter with a regular expression that matches the "num" pattern. Nutch is then able to crawl the following URLs:

http://x.com?num=2
http://x.com?num=3
...

Nutch crawls these successfully when the pagination URL is given in an anchor tag (a href). The issue arises when the web pages use a JavaScript function for pagination, for example by calling a function like onPaginationSubmit(). In that case Nutch is not able to crawl those pages.

Can anyone suggest a solution for crawling such paginated pages?

Thanks and regards,
Deepa Devi
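For illustration, the kind of URL filter expression described above can be sketched as follows. The exact expression is not given in this message, so the pattern, the helper name is_pagination_url, and the candidate list are assumptions. The snippet only mimics the accept/reject idea in Python and is not Nutch's own filter code.

```python
import re

# Hypothetical pattern (an assumption, not the expression actually used):
# accept only x.com URLs whose query string is a numeric "num" parameter.
PAGINATION_PATTERN = re.compile(r"^http://x\.com/?\?num=\d+$")


def is_pagination_url(url: str) -> bool:
    """Return True if the URL looks like one of the paginated pages."""
    return PAGINATION_PATTERN.match(url) is not None


# Candidate outlinks as a crawler might see them. The last entry stands in
# for a JavaScript-driven pagination control, which carries no real URL.
candidates = [
    "http://x.com?num=1",
    "http://x.com?num=2",
    "http://x.com?num=3",
    "http://x.com/about",
    "javascript:onPaginationSubmit()",
]

accepted = [url for url in candidates if is_pagination_url(url)]
print(accepted)
```

A filter like this can only accept URLs that actually appear in the fetched page. A pagination control that calls onPaginationSubmit() exposes no such URL, which matches the behaviour described above.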

