Does the following concept sound reasonable? * The Lucene index generator would run under a windows account that has full read access to all files stored on the NTFS file server. * For each file, the following information would have to be extracted: - the contents of the file and, - the access rights of the file (which windows users / groups have read access to the file)
All of this information is stored within the Document index. Once the index is complete and a user logs into the system then the following would occur: * The user's access rights would be read from Active Directory (i.e windows group membership, etc) * On the submission of a query to Lucene - the user / group access rights would be appended as required search criteria and Lucene would filter out all results that the user shouldn't see. Is this the correct approach to take with Lucene? And does anybody know how to extract user/group access right from Files lying on an NTFS drive? Ok, I know this is more of a general Java question - but I'd be interested in hearing from anyone who has faced a similar requirement and how they solved it. -----Original Message----- From: Peter Veentjer - Anchor Men [mailto:[EMAIL PROTECTED] Sent: 13 April 2005 13:56 To: java-user@lucene.apache.org Subject: RE: Searching an NTFS File Server You can use a filter with the IndexSearcher so that it removes all 'unwanted' results. -----Original Message----- From: Maher Martin Sent: 13 April 2005 12:39 To: java-user@lucene.apache.org Subject: Searching an NTFS File Server Hello all, We're currently evaluating search tools to cover the following requirement: We have an NTFS file server with 2 TB of files (word, excel, pdf, txt, etc). We would like to index all these files and integrate a simple web application into our intranet which will allow users to login using their windows credentials and search through this file index, returning only hits to files matching the users search criteria, to which the user has the *necessary rights* to view. My questions are: - How does one generate the list of results, so that the list contains only entries that the user may view? Should the returned list of results be post processed to filter out the invalid entries, and if so how? - Is 2 TB of data to large to be handled by Lucene? - Has anybody implemented a similar type of application? Thanks in advance Martin --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]