[Nutch-general] How to get page content given URL only?

Nguyen Ngoc Giang Fri, 09 Dec 2005 00:25:04 -0800

  Hi everyone,

  I'm writing a small program that uses Nutch purely as a crawler, with no
search functionality. Given a URL as input, the program should return that
page's content. How can I retrieve the content when I have only the URL? The
webdb only provides a way to look up a page's metadata by URL, while the
segments hold the content but can only be read by record number.


  Any help is greatly appreciated.

  Best regards,
  Giang
