nutch fetched but not indexed

宫照 Wed, 23 Jul 2008 20:27:48 -0700

Hi everybody,

I have run into a problem using Nutch. I use Nutch to crawl our intranet, and
it worked well until recently, when I added some new URLs to crawl. These URLs
are different from the usual ones. The new URLs look like this:
http://compass.mydomain.com/go/247460034


There are many folders and documents under this URL, such as a folder:
http://compass.mot.com/go/247460034/2354342276
documents:
http://compass.mot.com/go/247460034/mydoc.pdf

After the crawl, the documents under this kind of URL cannot be found by
search. I checked the log and found that these URLs are fetched during the
crawl, but they are not indexed.

I don't know why. Can you tell me what to do?

Regards,

Gong Zhao
