Hi Steve,

A ticket that (I believe) covers this was recently opened and worked on. See:

https://issues.apache.org/jira/browse/CONNECTORS-886

The fix for this has been committed to trunk (except for the ElasticSearch
component support). If you are in a position to try this out against your
domain, you could confirm that it works as planned.

Thanks,
Karl

On Fri, Feb 21, 2014 at 1:26 PM, Steve Kearns <[email protected]> wrote:

> Hi,
>
> I am setting up a crawler for a Windows Share (CIFS), with output to a
> Solr 4 index.
>
> I was able to get things up and running quite well -- thanks for the great
> documentation, it has all worked as expected, with one rather nuanced
> question around security.
>
> My internal users are quite security conscious, and some of them have
> raised the question of how security works with regard to directory
> traversal permissions.
>
> Here's an example to illustrate:
>
> \\server\Folder1\Folder2\Foo.txt
>
> The share allows Domain Users to connect.
> Folder1 also allows Domain Users to read.
> Folder2 does not inherit permissions from Folder1, and only the user
> Admin1 has read/write permissions. Domain Users do have permissions to
> traverse the folder.
> Foo.txt has explicit permissions that enable Domain Users to read and
> write the file.
>
> In a modern Active Directory, there is a Group Policy Object (GPO) setting
> called "Bypass Traverse," which is granted to *Everyone* by default. This
> setting causes Windows ACL security checks to ignore whether a user has
> traversal rights on a folder, and to look only at the file itself.
> However, it's not present pre-Windows 7 and it can be disabled for specific
> groups, so let's assume that Domain Users does not have the Bypass Traverse
> setting.
>
> If this is the case, then users in the Domain Users group would not be
> able to see and open Foo.txt, even though they have explicit RW permissions
> on it -- Windows would see that they don't have the rights to traverse the
> folder leading to the file, and it would deny access.
>
> Now on to the questions:
>
> 1. Is this a common scenario? In other words, do other users here worry
> about this directory traverse setting when crawling/querying CIFS shares and
> Windows folders? How do others handle this?
>
> 2. If Bypass Traverse is allowed, a Domain User could read Foo.txt, but
> only if they knew the explicit path. By indexing it for search and
> returning it in search results, the user now knows something they may not
> have known before. This seems like a potential security issue; do you
> agree?
>
> 3. Is there a way to configure Apache ManifoldCF to perform traversal
> checking for CIFS shares and Windows folders?
>
> Thanks in advance!
> Steve
