Hi All,
I am using Nutch 0.9.
I want to crawl a single web page so that Nutch gives me the number of
links on that page and the links themselves.
However, Nutch does everything else as well: it creates the webdb, a set of
segments, and the index.
I need to measure how long Nutch takes to crawl a web page so that I can
compare it with other crawlers.
For example,
Input - http://localhost:8080/HTML/1.html
Output - no. of links in 1.html and the links themselves
Is it possible to achieve this with Nutch?
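
To make the requirement concrete, here is a rough sketch in plain Java (it does not use Nutch at all) of the output I am after. The class name LinkCounter and the method extractLinks are made up for illustration, the HTML is a hard-coded sample rather than a fetched page, and the regex is only a crude stand-in for a real HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch: counts and lists the links in an HTML string.
public class LinkCounter {
    // Matches href="..." or href='...' inside <a> tags.
    // A rough heuristic only; unquoted href values are not handled.
    private static final Pattern HREF = Pattern.compile(
        "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    public static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<String>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(2)); // group 2 is the URL between the quotes
        }
        return links;
    }

    public static void main(String[] args) {
        // Sample content standing in for 1.html; fetching the page is left out.
        String html = "<html><body><a href=\"2.html\">two</a> "
                    + "<a href='3.html'>three</a></body></html>";

        long start = System.nanoTime();
        List<String> links = extractLinks(html);
        long elapsedMicros = (System.nanoTime() - start) / 1000;

        System.out.println("no. of links: " + links.size());
        for (String link : links) {
            System.out.println(link);
        }
        System.out.println("time taken (microseconds): " + elapsedMicros);
    }
}
```

What I would like to know is whether Nutch can be run so that it stops after this step, without building the segments and the index.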
Thanks & Regards,
Naveen Goswami
91 9899547886