Hi,
I am trying to test a scenario in Nutch.
Scenario - I have a page P1 which has content C1.
I indexed it using bin/nutch ..
After redeploying Nutch, a search for C1 returns the page.
I then changed the content of the same page P1 from C1 to
C1,C2.
I have recrawled the web application.
However:
If I search for C2, the page is not returned.
If I search for C1, the page is returned, but
the content shown is still the old content, i.e. C1 only.
I assume the cause is db.default.fetch.interval, which is set to 30.
This is the number of days that must pass before a page is refetched.
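
To make my assumption concrete, here is a simplified model in Python of how I understand the interval check. This is only a sketch, not Nutch's actual code: the function name is_due_for_refetch and the dates are made up for illustration, and I am assuming a page is skipped until the full interval has elapsed since its last fetch.

```python
from datetime import datetime, timedelta


def is_due_for_refetch(last_fetch, now, interval_days=30.0):
    """Return True once interval_days have elapsed since last_fetch.

    Hypothetical helper; models my assumption about the fetch interval.
    """
    return now >= last_fetch + timedelta(days=interval_days)


# Hypothetical timestamps, for illustration only.
last_fetch = datetime(2008, 12, 30, 12, 0)
recrawl_time = last_fetch + timedelta(hours=2)

# With the 30-day default, a recrawl two hours later skips the page.
print(is_due_for_refetch(last_fetch, recrawl_time))

# With an interval of one hour (1/24 of a day), the page would be due.
print(is_due_for_refetch(last_fetch, recrawl_time, interval_days=1 / 24))
```

If this model is right, it would explain why the recrawl left P1 with the old content C1.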
How can I recrawl the site every hour?
I am using nutch-0.9. I have also tried floating-point values such as 15f .. .
Please share your suggestions.
Regards,
Rinesh