Hi Karl,

What timing! Thanks for the quick reply.

I just pulled trunk from SVN, and after a fresh build and a fresh Solr install (with the 2 new fields in the schema), I have had no luck seeing the new fields ([allow,deny]_token_parent) populated in Solr when doing a Windows Share crawl. Is there an extra setting needed on the Windows Share connector to make this happen?

In a simple test, I don't see the new fields being given anything other than their default value, so I'm worried that I did something wrong. Here is the directory structure I crawled:

\\<server>\share - Share-level access to Domain Users and
\\<server>\share - ACLs set to allow access to Everyone.
\\<server>\share\FileEveryoneCanRead.txt - Inherits ACLs from parent folder.
\\<server>\share\folder - Explicitly do *not* inherit permissions. Allow only domain administrator account to read/write.
\\<server>\share\folder\Test.txt - Inherit from parent directory, and also explicitly allow Domain Users to Read/Write.

Here's when this matters: Windows shares, run from a modern Windows server with default settings, will let Domain Users read/write Test.txt, because of the "Bypass Traverse" GPO setting, which ignores the fact that Domain Users don't have directory traverse privileges on \\<server>\share\folder. In some environments, traversal is enforced, and the flattened security settings can't account for it.

I was hoping that there would be a way for me to index "Directory Traverse" permissions for each level of directory between the share root and a given file, and enforce them at query time. I'm not sure that is what CONNECTORS-886 is intended to do, is it?

Thanks!
Steve


On Mon, Feb 24, 2014 at 8:23 AM, Karl Wright <[email protected]> wrote:

> Hi Steve,
>
> Work on the CONNECTORS-886 ticket is now completed. It would be great to
> try this out on your particular CIFS setup to be sure it properly captures
> your particular security situation.
> If you are willing, this is how to do that:
>
> - Check out https://svn.apache.org/repos/asf/manifoldcf/trunk
> - Build it: ant make-core-deps make-deps build
> - Download the appropriate plugin release candidate, either from
>   http://people.apache.org/~kwright/apache-manifoldcf-solr-3.x-plugin-2.0 or
>   from http://people.apache.org/~kwright/apache-manifoldcf-solr-4.x-plugin-2.0
> - Install the plugin on your Solr instance, being sure to configure all
>   SIX fields that it requires
> - Run ManifoldCF and try indexing content that is protected via path to
>   that Solr instance, and see whether security is appropriately enforced
> - Let us know what happens!
>
> Thanks for all your help!
>
> Karl
>
>
> On Fri, Feb 21, 2014 at 2:29 PM, Karl Wright <[email protected]> wrote:
>
>> Hi Steve,
>> A ticket was recently opened and worked on which (I believe) covers
>> this. See:
>>
>> https://issues.apache.org/jira/browse/CONNECTORS-886
>>
>> The fix for this has been committed to trunk (except for the
>> ElasticSearch component support). If you are in a position to try this out
>> against your domain, you could confirm that it works as planned.
>>
>> Thanks,
>> Karl
>>
>>
>> On Fri, Feb 21, 2014 at 1:26 PM, Steve Kearns <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I am setting up a crawler for a Windows Share (CIFS), with output to a
>>> Solr 4 index.
>>>
>>> I was able to get things up and running quite well -- thanks for the
>>> great documentation, it has all worked as expected, with one rather nuanced
>>> question around security.
>>>
>>> My internal users are quite security conscious, and some of them have
>>> raised the question of how security works with regard to directory
>>> traversal permissions.
>>>
>>> Here's an example to illustrate:
>>>
>>> \\server\Folder1\Folder2\Foo.txt
>>>
>>> The share allows Domain Users to connect.
>>> Folder1 also allows Domain Users to read.
>>> Folder2 does not inherit permissions from Folder1, and only the user
>>> Admin1 has read/write permissions. Domain Users do not have permissions to
>>> traverse the folder.
>>> Foo.txt has explicit permissions that enable Domain Users to read and
>>> write the file.
>>>
>>>
>>> In a modern Active Directory, there is a Group Policy Object (GPO)
>>> setting called "Bypass Traverse," which is granted to *Everyone* by
>>> default. This setting causes Windows ACL security checks to ignore whether
>>> a user has traversal rights on a folder, and to look only at the file
>>> itself.
>>> However, it's not present pre-Windows 7 and it can be disabled for
>>> specific groups, so let's assume that Domain Users does not have the Bypass
>>> Traverse setting.
>>>
>>> If this is the case, then users in the Domain Users group would not be
>>> able to see and open Foo.txt, even though they have explicit RW permissions
>>> on it -- Windows would see that they don't have the rights to traverse the
>>> folder leading to the file, and it would deny access.
>>>
>>>
>>> Now on to the questions:
>>>
>>> 1. Is this a common scenario? In other words, do other users here worry
>>> about this directory traverse setting when crawling/querying CIFS shares and
>>> Windows folders? How do others handle this?
>>>
>>> 2. If Bypass Traverse is allowed, a Domain User could read Foo.txt, but
>>> only if they knew the explicit path. By indexing it for search and
>>> returning it in search results, the user now knows something they may not
>>> have known before. This seems like a potential security issue, do you
>>> agree?
>>>
>>> 3. Is there a way to configure Apache ManifoldCF to perform traversal
>>> checking for CIFS shares and Windows folders?
>>>
>>>
>>> Thanks in advance!
>>> Steve
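
For reference, the traversal rule discussed in this thread can be written down as a minimal Python sketch. This is not ManifoldCF or Solr plugin code, and it is not how Windows evaluates ACLs internally. The names Node and can_open, and the flattened per-node "read" and "traverse" group sets, are hypothetical and exist only to illustrate the rule: a file is reachable if the user can read it and either Bypass Traverse applies or every folder on the path grants traverse.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One folder or file, with hypothetical flattened ACL group sets."""
    name: str
    read: set = field(default_factory=set)      # groups allowed to read
    traverse: set = field(default_factory=set)  # groups allowed to traverse (folders)


def can_open(user_groups, folders, target, bypass_traverse):
    """Return True if a user holding user_groups may open target.

    folders is the chain of folder Nodes from the share root down to the
    parent of target. When bypass_traverse is True, only the ACL on the
    target itself is checked, mirroring the "Bypass Traverse" behaviour
    described in the thread.
    """
    if not (user_groups & target.read):
        return False
    if bypass_traverse:
        return True
    # Traversal enforced: every folder on the path must grant traverse.
    return all(user_groups & folder.traverse for folder in folders)


# Test layout from the top of this thread (group names are illustrative).
user = {"Domain Users", "Everyone"}
share = Node("share", read={"Everyone"}, traverse={"Everyone"})
folder = Node("folder", read={"Domain Admins"}, traverse={"Domain Admins"})
test_txt = Node("Test.txt", read={"Domain Users", "Domain Admins"})

with_bypass = can_open(user, [share, folder], test_txt, bypass_traverse=True)
without_bypass = can_open(user, [share, folder], test_txt, bypass_traverse=False)
print(with_bypass, without_bypass)
```

With these assumptions the two calls disagree for Test.txt, which is exactly the case a flattened per-document ACL cannot express: the answer depends on the folders above the file, not only on the file.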
