Evaluating Nutch - Some questions

LoneEagle70 Wed, 17 Oct 2007 13:53:35 -0700

Hi,

We want to know if Nutch could be used for our project:


1) While browsing, some sites require the user to provide information such
as 'Country, Zip Code, Language'.
How should this information be handled?

2) Dynamic links through JavaScript or form submission:
We need site specific rules to build the list of subsequent pages that
should be visited from a given page.

For example, many sites have an option list which should be selected prior
to moving to the next page.
Each option in the list goes to a different page.

On such a site, the rule would be: subsequent pages are obtained by looping
through option field "z"
and building url=urlprefix + <value of z> + urlsuffix
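
To make the rule concrete, here is a rough sketch of what we mean. It is
plain Python using only the standard library, not Nutch code, and the names
(OptionValueParser, build_urls, the select name, the prefix and suffix) are
only illustrative assumptions:

```python
from html.parser import HTMLParser


class OptionValueParser(HTMLParser):
    """Collects the value of each <option> inside one named <select>."""

    def __init__(self, select_name):
        super().__init__()
        self.select_name = select_name
        self.in_target_select = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select":
            # Only track options that belong to the select we care about.
            self.in_target_select = attrs.get("name") == self.select_name
        elif tag == "option" and self.in_target_select:
            value = attrs.get("value")
            if value:  # skip placeholder options with an empty or missing value
                self.values.append(value)

    def handle_endtag(self, tag):
        if tag == "select":
            self.in_target_select = False


def build_urls(html, select_name, url_prefix, url_suffix):
    """Site-specific rule: url = urlprefix + <value of option> + urlsuffix."""
    parser = OptionValueParser(select_name)
    parser.feed(html)
    return [url_prefix + value + url_suffix for value in parser.values]
```

Each site would need its own select name, prefix and suffix, so we would
expect to keep one such rule per site.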

How should this be handled?

3) Once we have a page, how can we extract specific information?
If an element of interest is an image file, how can we download the image
file?

4) We want to store the information gathered into our own PostgreSQL
database.
Do we need the Nutch database, or can it be disabled?

If it's needed to control the URL traversal, can it be set up not to save
page content?

Can we disable the indexing step?
-- 
View this message in context: 
http://www.nabble.com/Evaluating-Nutch---Some-questions-tf4643083.html#a13262171
Sent from the Nutch - User mailing list archive at Nabble.com.
