Hi, I am setting up a crawler for a Windows Share (CIFS), with output to a Solr 4 index.
I was able to get things up and running quite well -- thanks for the great documentation, it has all worked as expected. I do have one rather nuanced question around security. My internal users are quite security conscious, and some of them have asked how security works with regard to directory traversal permissions.

Here's an example to illustrate:

\\server\Folder1\Folder2\Foo.txt

The share allows Domain Users to connect. Folder1 also allows Domain Users to read. Folder2 does not inherit permissions from Folder1, and only the user Admin1 has read/write permissions on it. Domain Users do not have permission to traverse Folder2. Foo.txt has explicit permissions that allow Domain Users to read and write the file.

In a modern Active Directory, there is a Group Policy Object (GPO) setting called "Bypass Traverse," which is granted to *Everyone* by default. This setting causes Windows ACL security checks to ignore whether a user has traversal rights on a folder and to look only at the file itself. However, it's not present pre-Windows 7 and it can be disabled for specific groups, so let's assume that Domain Users does not have the Bypass Traverse setting. In that case, users in the Domain Users group would not be able to see and open Foo.txt, even though they have explicit read/write permissions on it -- Windows would see that they don't have the right to traverse the folder leading to the file, and it would deny access.

Now on to the questions:

1. Is this a common scenario? In other words, do other users here worry about this directory traverse setting when crawling/querying CIFS shares and Windows folders? How do others handle this?

2. If Bypass Traverse is allowed, a Domain User could read Foo.txt, but only if they knew the explicit path. By indexing the file for search and returning it in search results, we tell the user something they may not have known before. This seems like a potential security issue. Do you agree?

3. Is there a way to configure Apache ManifoldCF to perform traversal checking for CIFS shares and Windows folders?

Thanks in advance!

Steve
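
P.S. To make the scenario concrete, here is a rough Python sketch of the access decision as I understand it. It is only a toy model, not how Windows or ManifoldCF actually implement the check: the `Node` class, the `can_open` function, and the permission sets are all hypothetical names I made up for illustration, and it assumes Domain Users lack traverse rights on Folder2.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a folder or file and its ACL."""
    name: str
    readers: set = field(default_factory=set)     # principals allowed to read
    traversers: set = field(default_factory=set)  # principals allowed to traverse (folders)


def can_open(path, principals, bypass_traverse):
    """Return True if a user holding `principals` could open the last node in `path`.

    `path` is a list of Node objects from the top folder down to the file.
    """
    *folders, target = path
    if not bypass_traverse:
        # Without Bypass Traverse, every folder on the way must grant traverse.
        for folder in folders:
            if not (folder.traversers & principals):
                return False
    # In both cases the file's own ACL must grant read access.
    return bool(target.readers & principals)


# The example from this message, with the assumption stated above.
path = [
    Node("Folder1", readers={"Domain Users"}, traversers={"Domain Users"}),
    Node("Folder2", readers={"Admin1"}, traversers={"Admin1"}),
    Node("Foo.txt", readers={"Domain Users"}),
]
domain_user = {"Domain Users"}

print(can_open(path, domain_user, bypass_traverse=True))
print(can_open(path, domain_user, bypass_traverse=False))
```

In this model, the first call reflects question 2 (the file is readable if the path is known), and the second reflects the case where traversal checking is enforced and access is denied at Folder2.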
