Hi guys.

I need some help excluding specific files from a crawl.

I am trying to crawl a Windows file server using the Windows shares
connector and index the content into Solr.
There are some files I do not want to index, so I set rules to exclude
them from crawling, but the job still crawls them.

For example, I do NOT want to index "text.txt" in directory D, which is
the root path.

In the "Paths" tab:
- Set D as the root path.
- To create a crawling rule, chose "exclude" and "file" from the pulldowns
and entered "text.txt" in the text box.
- The resulting list of crawling rules is as follows:

  1. Exclude file(s) matching text.txt
  2. Include indexable file(s) matching *
  3. Include directory(s) matching *

- Saved the job settings.

As a result, the job still tries to crawl the file.
I wonder why "text.txt" does not match the crawling rule.
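
To make my expectation concrete, here is a minimal Python sketch of how I
assumed the rule list is evaluated: top to bottom, first matching rule wins,
with the pattern compared against the bare file name. This is only my mental
model, not MCF's actual code. The names RULES and decide are made up for
illustration, "indexable file" is simplified to "file", and the default of
excluding anything unmatched is also an assumption.

```python
from fnmatch import fnmatchcase

# Hypothetical model of the rule list from the "Paths" tab.
# This is NOT MCF's real data structure, just my reading of the UI.
RULES = [
    ("exclude", "file", "text.txt"),
    ("include", "file", "*"),        # "indexable file" simplified to "file"
    ("include", "directory", "*"),
]

def decide(name, kind, rules=RULES):
    """Return the action of the first rule whose kind and pattern match."""
    for action, rule_kind, pattern in rules:
        if rule_kind == kind and fnmatchcase(name, pattern):
            return action
    # Assumption: an entry that matches no rule is not crawled.
    return "exclude"

print(decide("text.txt", "file"))
print(decide("other.txt", "file"))
print(decide("sub", "directory"))
```

Under this model "text.txt" would be excluded by rule 1 before rule 2 is
ever checked, which is why the actual behavior surprises me. I do not know
whether MCF compares the pattern with the file name or with a longer path
string; in this sketch a string such as "D/text.txt" would skip rule 1 and
be included by rule 2.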

Does anyone know what I did wrong?

Version:
  MCF 0.5
  Solr 3.5
  MySQL 5.5


Regards,

Shigeki
