Hi guys. I need some help excluding specific files from crawling.
I am trying to crawl a Windows file server using the Windows shares connector and index it into Solr. There are some files I do not want to index, so I set paths to exclude them from crawling, but the job crawls them anyway.

For example, I do NOT want to index "text.txt" in a directory D, which is the root path.

In the "Paths" tab:
- Set D as the root path.
- To create a crawling rule, chose "exclude" and "file" from the pulldowns, and entered "text.txt" in the text box.
- The list of crawling rules was created as follows:
  1. Exclude file(s) matching text.txt
  2. Include indexable file(s) matching *
  3. Include directory(s) matching *
- Saved the job settings.

As a result, the job still tries to crawl the file. I wonder why "text.txt" does not match the crawling rule. Does anyone know what I did wrong?

Version:
MCF 0.5
Solr 3.5
MySQL 5.5

Regards,
Shigeki
