It depends on what you are trying to do. The content part of each segment stores the full content (HTML, etc.) of each page, and the cached.jsp page displays that full content.
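
As a rough illustration of getting the stored HTML back out: one approach is to dump a segment to plain text with the segment reader (bin/nutch readseg -dump, check the usage message of your Nutch version for the exact arguments) and then split that dump into per-URL records. The sketch below assumes each record in the dump starts with a "Recno::" line followed by a "URL::" line, and that the raw page follows a line containing only "Content:". That layout and the helper name split_dump are assumptions for illustration, so compare them against what your dump file actually looks like.

```python
import re


def split_dump(dump_text):
    """Split a segment dump into {url: html}.

    Hypothetical helper: the record layout ("Recno::", "URL::", "Content:")
    is an assumption about the dump format, not a confirmed specification.
    """
    pages = {}
    # Each record is assumed to begin with a line like "Recno:: 0".
    records = re.split(r"^Recno:: \d+[ \t]*$", dump_text, flags=re.MULTILINE)
    for record in records:
        url_match = re.search(r"^URL:: (.+)$", record, flags=re.MULTILINE)
        if not url_match:
            continue  # text before the first record, or a malformed record
        # The raw page is assumed to follow a line containing only "Content:".
        # The section header "Content::" (two colons) does not match this.
        body_match = re.search(
            r"^Content:[ \t]*\n(.*)", record, flags=re.MULTILINE | re.DOTALL
        )
        if body_match:
            pages[url_match.group(1).strip()] = body_match.group(1).strip()
    return pages
```

Reading the dump file into a string and passing it to split_dump would give a dictionary keyed by URL. If the dump also contains fetch or parse sections, they end up after the HTML in each record and would need to be trimmed or left out of the dump.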

Dennis Kubes


LoneEagle70 wrote:
Hi,

I was able to install Nutch 0.9 and crawl a site and use the Web Page to do
full text search of my db.

But we need to extract information from all of the HTML pages.

So, is there a way to extract HTML pages from the db?
