Re: document level security: indexing/searching techniques

2010-07-06 Thread osocurious2

Someone else asked a similar question recently (or maybe it was you, worded
differently :) ).

Putting user-level security at the document level seems like a recipe for
pain. Solr/Lucene don't handle frequent updates well, and since they are
highly optimized for querying, I don't blame them. Is there any way to create
a set of roles that you can apply to your documents? If the security level of
the documents isn't changing, just the users' access to them, give the docs a
role in the index, keep your user/usergroup data in a DB or some other system,
resolve each user into valid roles, and then apply a filter query (fq) on the
role field.
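
A minimal sketch of that flow, in Python. The user/group/role tables, the
field name "role", and the helper names are all made up for illustration;
in practice the lookup would hit your DB or directory service, and only the
resulting string would be sent to Solr as the fq parameter.

```python
from urllib.parse import urlencode

# Hypothetical lookup tables; a real system would query a DB or directory.
USER_GROUPS = {"alice": ["engineering", "staff"], "bob": ["staff"]}
GROUP_ROLES = {"engineering": ["source-reader"], "staff": ["public-reader"]}


def roles_for(user):
    """Resolve a user to a sorted, de-duplicated list of roles."""
    roles = set()
    for group in USER_GROUPS.get(user, []):
        roles.update(GROUP_ROLES.get(group, []))
    return sorted(roles)


def quote(term):
    """Quote a term as a phrase, escaping backslashes and double quotes."""
    return '"' + term.replace("\\", "\\\\").replace('"', '\\"') + '"'


def role_filter(roles, field="role"):
    """Build a filter query string that ORs together the user's roles."""
    if not roles:
        # Pure negative clause, intended to match no documents at all,
        # so a user without roles never sees anything.
        return "-*:*"
    return field + ":(" + " OR ".join(quote(r) for r in roles) + ")"


def solr_params(user, user_query):
    """URL-encode the user's query plus the role filter as q and fq."""
    return urlencode({"q": user_query, "fq": role_filter(roles_for(user))})


if __name__ == "__main__":
    print(role_filter(roles_for("alice")))
    print(solr_params("bob", "security"))
```

When a user's access changes, only the lookup tables change; the documents
in the index keep the same role values and need no re-indexing.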


Re: document level security: indexing/searching techniques

2010-07-06 Thread Ken Krugler




You're right, baking in too fine-grained a level of security information
is a bad idea.

As one example that worked pretty well for code search with Krugle, we set
access control at the project level using LDAP groups, i.e. each project
had some number of groups that were granted access rights. Each file in
the project would inherit the same list of groups.

Then, when a user logs in, they are authenticated via LDAP, and the LDAP
server returns the set of groups they belong to. This becomes a fairly
well-bounded list of terms for an OR query against the acl-groups field in
each file/project document. Just don't forget to set the boost to 0 for
that portion of the query, so it doesn't affect relevance scoring :)
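
As a rough illustration of the query shape described above (a sketch, not
Krugle's actual code): the helper names are hypothetical, the group names
are invented, and the field name acl-groups is taken from the description.
Whether a hyphenated field name is accepted unescaped depends on the query
parser, so treat that as an assumption too.

```python
def escape(term):
    """Escape backslashes and double quotes for use inside a quoted phrase."""
    return term.replace("\\", "\\\\").replace('"', '\\"')


def acl_clause(groups, field="acl-groups"):
    """OR together the user's groups and zero the boost of the whole clause."""
    terms = " OR ".join('%s:"%s"' % (field, escape(g)) for g in groups)
    # ^0 keeps the ACL match from contributing to the relevance score.
    return "(" + terms + ")^0"


def secured_query(user_query, groups):
    """Require both the user's query and a match on at least one group."""
    if not groups:
        raise ValueError("user belongs to no groups; nothing is searchable")
    return "+(" + user_query + ") +" + acl_clause(groups)


if __name__ == "__main__":
    # Groups as they might come back from the LDAP server after login.
    print(secured_query("parse xml", ["dev", "qa"]))
```

Because the group list comes from the directory at login time, the query
stays bounded by the number of groups a user is in, not by the number of
users or documents.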


-- Ken


Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g






Re: document level security: indexing/searching techniques

2010-07-06 Thread Peter Sturge

Yes, you don't want to hard-code permissions into your index - it will give
you headaches.

You might want to have a look at SOLR-1872:
https://issues.apache.org/jira/browse/SOLR-1872
This patch provides doc-level security through an external ACL mechanism (in
this case, an XML file) that controls a filter query.
This way, you don't need to change the schema - you can even use existing
indexes, and you can change access control without affecting your stored
data.
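
To show the general idea of an external ACL file driving a filter query,
here is a small Python sketch. The XML layout, element names, and helper
functions below are invented for illustration and are NOT the format or
code used by the SOLR-1872 patch; see the JIRA issue for the real thing.

```python
import xml.etree.ElementTree as ET

# Hypothetical ACL file: each user lists the query clauses they may see.
ACL_XML = """
<acl>
  <user name="alice">
    <allow>project:alpha</allow>
    <allow>project:beta</allow>
  </user>
  <user name="bob">
    <allow>project:beta</allow>
  </user>
</acl>
"""


def load_acl(xml_text):
    """Parse the ACL document into {user name: [allowed query clauses]}."""
    root = ET.fromstring(xml_text)
    acl = {}
    for user in root.findall("user"):
        acl[user.get("name")] = [a.text.strip() for a in user.findall("allow")]
    return acl


def filter_for(acl, user):
    """Build the filter query for a user from the externally stored ACL."""
    clauses = acl.get(user)
    if not clauses:
        # Unknown user or empty entry: pure negative clause, meant to match nothing.
        return "-*:*"
    return " OR ".join("(" + c + ")" for c in clauses)


if __name__ == "__main__":
    acl = load_acl(ACL_XML)
    print(filter_for(acl, "alice"))
    print(filter_for(acl, "carol"))
```

Editing the XML and reloading it changes what each user can see; the index
and schema are never touched, since the clauses refer to existing fields.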

HTH,
Peter



Re: document level security: indexing/searching techniques

2010-07-06 Thread Lance Norskog

What Ken describes is called 'role-based' security. Users have roles,
and access rules refer to roles, not users.

http://en.wikipedia.org/wiki/Role-based_access_control


-- 
Lance Norskog
goks...@gmail.com


Re: document level security: indexing/searching techniques

2010-07-06 Thread Glen Newton

You could implement a good solution with Lucene's underlying ParallelReader:
http://lucene.apache.org/java/3_0_2/api/core/org/apache/lucene/index/ParallelReader.html
Keep the 100 search fields (the 'static' info) in one index, and the
permissions info in another index that gets updated when the
permissions change.
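
A toy model of the idea in plain Python, to make the layout concrete. This
is not Lucene code and does not use the ParallelReader API; the class and
field names are invented. It only mirrors the constraint that the two
indexes hold the same number of documents in the same order, so that
document number N in one lines up with document number N in the other.

```python
class ParallelIndexes:
    """Two document lists aligned by document number (toy model only)."""

    def __init__(self, static_docs, permission_docs):
        self.static_docs = list(static_docs)
        self.permission_docs = []
        self.replace_permissions(permission_docs)

    def replace_permissions(self, permission_docs):
        """Swap in a rebuilt permissions index; the big index is untouched."""
        permission_docs = list(permission_docs)
        if len(permission_docs) != len(self.static_docs):
            raise ValueError("both indexes must hold the same number of documents")
        self.permission_docs = permission_docs

    def search(self, term, groups):
        """Return doc numbers whose text has the term and that a group may see."""
        allowed = set(groups)
        hits = []
        pairs = zip(self.static_docs, self.permission_docs)
        for doc_num, (static, perms) in enumerate(pairs):
            if term in static["text"].split() and allowed & set(perms["allow"]):
                hits.append(doc_num)
        return hits


if __name__ == "__main__":
    idx = ParallelIndexes(
        [{"text": "solr security notes"}, {"text": "lucene security internals"}],
        [{"allow": ["staff"]}, {"allow": ["engineering"]}],
    )
    print(idx.search("security", ["staff"]))
    # Permissions change: rebuild only the small index, in the same order.
    idx.replace_permissions([{"allow": ["staff"]}, {"allow": ["staff", "engineering"]}])
    print(idx.search("security", ["staff"]))
```

The catch is the alignment requirement: the permissions index has to be
rebuilt in full, in the same document order, rather than updated in place.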

Does Solr expose this kind of functionality?

-Glen Newton
http://zzzoot.blogspot.com/
http://zzzoot.blogspot.com/2009/07/project-torngat-building-large-scale.html

On 7 July 2010 00:38, RL rl.subscri...@gmail.com wrote:

 I have a question about indexing/searching techniques in relation to document
 level security.
 I'm planning a system that has, let's say, about 1 million search documents
 with about 100 search fields each. Most of them are unstored to keep the index
 size low, because some of them can contain a few kilobytes and some of them
 several hundred kilobytes. Two of these search fields are for permission
 checking, where I keep the explicitly allowed and explicitly disallowed
 users and usergroups. (Usergroups can be in a hierarchical structure with
 permission inheritance.)

 So when a user searches in the system, his user id and the ids of his usergroup
 memberships are added as a filter query in my application logic before the
 query is sent to Solr. So far so good for the searching part.

 But the problem is that the permissions can be changed by administrators of
 that system, which requires re-indexing the two permission search fields.

 First idea:
 Partial updates of index entries are not possible, so I need to fetch all
 1 million documents from a database to do a re-indexing just because some
 permissions changed. The fetching process is rather expensive and takes
 more than 14 hours. I am sure that this can be optimized, but I
 would rather avoid re-indexing all the content.

 Second idea:
 Another idea would be to store just the permissions in one small,
 fast-to-update index and everything else in the other huge, rarely
 updated index. But I didn't find any way to combine these two
 indices in one query. Is that even possible?


 Does somebody have experience with these topics, or can anyone give advice on
 how to solve this case properly?
 Thanks in advance.




