Hi Remi,
Thanks a lot for the quick response. That's what I suspect as well:
pages being temporarily down and therefore not available to crawl and
index. I'll check the logs and see what I can figure out.
Thanks,
Elisabeth
On 27.03.2012 12:05, remi tassing wrote:
This happened to me before for a very specific reason and I'm not sure if
it's the same for you. Some of the websites I was trying to access
were temporarily down.
I would suggest you compare the logs of the different runs.
Remi
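One way to make that comparison concrete is to diff the sets of URLs seen in
two runs. The following is only a minimal sketch, not part of Nutch: it
assumes you have already written the URLs of each run to a plain text file,
one URL per line (for example extracted from the fetcher lines in the logs or
from a crawldb dump). The file names and the helper names `load_urls` and
`diff_runs` are hypothetical.

```python
import sys


def load_urls(path):
    """Read one URL per line from a text file, ignoring blank lines."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip() for line in handle if line.strip()}


def diff_runs(urls_a, urls_b):
    """Return (only_in_a, only_in_b) as sorted lists of URLs."""
    return sorted(urls_a - urls_b), sorted(urls_b - urls_a)


if __name__ == "__main__":
    # Hypothetical usage: python diff_runs.py run1_urls.txt run2_urls.txt
    if len(sys.argv) == 3:
        only_a, only_b = diff_runs(load_urls(sys.argv[1]), load_urls(sys.argv[2]))
        for url in only_a:
            print("only in first run: ", url)
        for url in only_b:
            print("only in second run:", url)
    else:
        print("usage: diff_runs.py RUN1_URLS RUN2_URLS")
```

URLs that show up in only one of the runs are the ones worth looking up in
that run's log, since they may be pages that were temporarily unavailable.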
On Tue, Mar 27, 2012 at 4:28 PM, Elisabeth Adler
<[email protected]> wrote:
Hi,
I'm using Nutch 1.3 to crawl dynamic pages (JSPs) and index them into
Solr. With the same settings, I sometimes get more documents indexed,
sometimes fewer. There are no errors in the log files. The Solr index and
Nutch crawl directory are removed before each crawl, so I have a clean
setup.
What could be the reason for these differences?
Any pointers appreciated,
Elisabeth