Hi all, I'm having some trouble trying to crawl and recrawl my local filesystem. I'm using the script posted at http://wiki.apache.org/nutch/IntranetRecrawl
My filesystem is laid out like this:

../
../first/
../first/file1.pdf
../first/second/
../first/second/file2.pdf
../first/second/third
../first/second/third/file2.pdf
../first/second/third/fourth/
../first/second/third/fourth/file4.pdf
../first/second/third/fourth/fifth
../first/second/third/fourth/fifth/file5.pdf

On the first crawl "round" everything seems fine: it stops at the "first" directory (depth 1).
On the first recrawl (depth 3) it stops at the "third" directory, and all the files seem to be indexed correctly.
On the second recrawl (again depth 3) it reaches the "fifth" directory, but none of the files are indexed.

Any ideas?

Thanks,
Luca

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers