Hi everybody, I am facing a problem with Nutch. I use Nutch to crawl our intranet. It worked well before, but recently I added some URLs to crawl. These URLs are different from the normal ones. The new URLs look like this: http://compass.mydomain.com/go/247460034
There are many folders and documents under this URL, such as:
folder: http://compass.mot.com/go/247460034/2354342276
documents: http://compass.mot.com/go/247460034/mydoc.pdf
After the crawl, the documents under this kind of URL cannot be searched. I checked the log and found that URLs of this kind were fetched but not indexed. I don't know why. Can you tell me what to do?

Regards,
Gong Zhao
