Re: Selective Searches Based on User Identity
Terence Gannon wrote:

> Paul -- thanks for the reply, I appreciate it. That's a very practical approach, and is worth taking a closer look at. Actually, taking your idea one step further, perhaps three fields; 1) ownerUid (uid of the document's owner) 2) grantedUid (uid of users who have been granted access), and 3) deniedUid (uid of users specifically denied access to the document).

Grants might change quite a bit, but the owner will likely remain the same. Wouldn't it be better to include only the owner in the document and store grants someplace else, like in an RDBMS or, if you don't want one, a lightweight embedded database like BDB? That way your application could tag an ineluctable filter query onto each and every user query, ensuring that the results include only documents whose owner has granted the user access.

Considering that I'm a Solr/Lucene newbie, this approach might have a disadvantage that escapes me, which would explain why other people haven't made this particular suggestion. If so, I'd be happy to learn why this isn't preferable.

Michael Ludwig
RE: Selective Searches Based on User Identity
Yes, the ownerUid will likely be assigned once and never changed. But you still need it, in order to keep track of who has contributed which document.

I've been going over some of the simpler query scenarios, and Solr is capable of handling them without having to resort to an external RDBMS. In order to limit documents to those which a given user owns, or those to which he has been granted access, the syntax fragment would be something like:

    ownerUid:ab2734 OR grantedUid:ab2734

where ab2734 is the uid of the user doing the query. However, I'm less comfortable with more complex query scenarios, particularly if the concept of groups is eventually introduced, which is likely in my scenario. In the latter case, it may be necessary to use an external RDBMS. I'll plead ignorance of the 'ineluctable filter query' and will have to read up on that one.

With respect to updates to rights, they are not likely to be that frequent, but when they happen, the entire document will have to be reindexed rather than simply updating the grantedUid and/or deniedUid fields. I don't believe Solr supports the updating of individual fields, at least not yet. This may be another reason to eventually go to an external RDBMS.

Thanks very much for your help!

Terence

-----Original Message-----
From: Michael Ludwig
Sent: May 13, 2009 05:27
To: solr-user@lucene.apache.org
Subject: Re: Selective Searches Based on User Identity

> Grants might change quite a bit, the owner will likely remain the same. Wouldn't it be better to include only the owner in the document and store grants someplace else [...]
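As a minimal sketch of how an application might assemble the access fragment discussed in this message: the field names ownerUid, grantedUid and deniedUid come from the thread, while the helper names, the phrase-quoting of the uid and the optional handling of denials are assumptions made for illustration only.

```python
def quote_term(value):
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def access_filter(uid, honor_denials=False):
    """Build the access fragment for one user (hypothetical helper)."""
    term = quote_term(uid)
    if honor_denials:
        # Assumption: a denial cancels a grant but never hides a document
        # from its owner. The thread does not spell out this rule.
        return f"ownerUid:{term} OR (grantedUid:{term} AND NOT deniedUid:{term})"
    return f"ownerUid:{term} OR grantedUid:{term}"


print(access_filter("ab2734"))
```

Quoting the uid is a defensive choice so that characters inside it are not read as query syntax. Note that the boolean operators are written in upper case, which is what the Lucene query syntax expects.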
Re: Selective Searches Based on User Identity
Hi Terence,

Terence Gannon wrote:

> Yes, the ownerUid will likely be assigned once and never changed. But you still need it, in order to keep track of who has contributed which document.

Yes, of course!

> I've been going over some of the simpler query scenarios, and Solr is capable of handling them without having to resort to an external RDBMS.

The database is only to store grants - it's not to help with searching. It would look like this:

    grantee | grant
    --------+-----------------
    fritz   | fred,frank,egon
    frank   | egon,fritz
    egon    | terence,frank
    ...

Each user is granted access to his own documents and to those he has received grants for.

> In order to limit documents to those which a given user owns, or those to which he has been granted access, the syntax fragment would be something like; ownerUid:ab2734 OR grantedUid:ab2734

I think it could be:

    ownerUid:egon OR ownerUid:terence OR ownerUid:frank

No need to embed grants in the document.

Ah, I see my mistake now. You want grants based on the document, not on the user - I had overlooked that fact. That makes my suggestion invalid.

> I'll plead ignorance of the 'ineluctable filter query' and will have to read up on that one.

I meant a filter query that the application tags onto the query on behalf of the user, without the user being able to do anything about it, so he cannot circumvent the filter.

Best regards,

Michael Ludwig
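The per-owner variant described in this message (and withdrawn in the same message, because the grants wanted here are per document) can be sketched as follows. The grants table is the one shown above, held in a plain dictionary instead of a database; the function name is hypothetical, and the sketch assumes user names are simple tokens that need no escaping.

```python
# Grants as in the table above: grantee -> owners who granted access.
GRANTS = {
    "fritz": ["fred", "frank", "egon"],
    "frank": ["egon", "fritz"],
    "egon": ["terence", "frank"],
}


def owner_filter(user, grants):
    """Return a filter matching the user's own documents plus those of
    every owner who has granted access to the user."""
    owners = [user] + [owner for owner in grants.get(user, []) if owner != user]
    return " OR ".join(f"ownerUid:{owner}" for owner in owners)


print(owner_filter("egon", GRANTS))
```

With this layout a grant covers all of an owner's documents at once, which is exactly why it does not fit a requirement for document-level grants.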
RE: Selective Searches Based on User Identity
Paul -- thanks for the reply, I appreciate it. That's a very practical approach, and is worth taking a closer look at. Actually, taking your idea one step further, perhaps three fields:

1) ownerUid (uid of the document's owner)
2) grantedUid (uid of users who have been granted access)
3) deniedUid (uid of users specifically denied access to the document)

These fields, coupled with some business rules around how they are populated, should cover off all possibilities, I think. Access to the Solr instance would have to be tightly controlled, but that's something that should be done anyway. You sure wouldn't want end users preparing their own XML and throwing it at Solr -- it would be pretty easy to figure out how to get around the access/denied fields and get at stuff the owner didn't intend. This approach mimics to some degree what is being done in the operating system, but it's still elegant and provides the level of control required.

Anybody else have any thoughts in this regard? Has anybody implemented anything similar, and if so, how did it work?

Thanks, and best regards...

Terence
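To make the three-field idea concrete, here is a small sketch of one possible set of business rules. The field names are from the message; the document shape, the uids and the precedence rules (the owner always sees the document, an explicit denial overrides a grant) are assumptions, since the thread does not define them.

```python
def can_see(doc, uid):
    """Return True if `uid` may see `doc` under the assumed rules."""
    if doc["ownerUid"] == uid:
        return True  # assumption: the owner can always see the document
    if uid in doc.get("deniedUid", []):
        return False  # assumption: an explicit denial overrides a grant
    return uid in doc.get("grantedUid", [])


# Hypothetical document with multi-valued grant and deny fields.
doc = {
    "id": "doc-1",
    "ownerUid": "userA",
    "grantedUid": ["userB", "userC"],
    "deniedUid": ["userC"],
}

for uid in ("userA", "userB", "userC", "userD"):
    print(uid, can_see(doc, uid))
```

Whatever rules are chosen, the same logic has to be expressible as a query restriction, because the search engine applies it at query time.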
Re: Selective Searches Based on User Identity
I also work with the FAST Enterprise Search engine and this is exactly how their Security Access Module works. They actually use a modified base-32 encoded value for indexing, but that is because they don't have the luxury of untokenized/un-processed String fields like Solr.

Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com

On May 12, 2009, at 12:26 PM, Terence Gannon wrote:

> Paul -- thanks for the reply, I appreciate it. That's a very practical approach, and is worth taking a closer look at. [...]
Re: Selective Searches Based on User Identity
The only downside would be that you would have to update a document anytime a user was granted or denied access. You would have to query before the update to get the current values for grantedUid and deniedUid, remove/add values, and update the index. If you don't have a lot of changes in the system that wouldn't be a big deal, but if a lot of changes are happening throughout the day you might have to queue requests and batch them.

-Jay

On Tue, May 12, 2009 at 1:05 PM, Matt Weber m...@mattweber.org wrote:

> I also work with the FAST Enterprise Search engine and this is exactly how their Security Access Module works. [...]
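The read-modify-write cycle and the batching Jay describes can be sketched as below. No Solr calls are made: the stored document is a plain dictionary, and the queue format, action names and function names are all assumptions for illustration. In a real system the current document would be fetched first and the result re-sent as a whole document.

```python
from collections import defaultdict


def coalesce(changes):
    """Group queued (doc_id, action, uid) changes per document so that
    each document is rewritten only once per batch."""
    per_doc = defaultdict(list)
    for doc_id, action, uid in changes:
        per_doc[doc_id].append((action, uid))
    return dict(per_doc)


def apply_changes(doc, actions):
    """Return a copy of `doc` with the ACL actions applied in order."""
    granted = set(doc.get("grantedUid", []))
    denied = set(doc.get("deniedUid", []))
    for action, uid in actions:
        if action == "grant":
            granted.add(uid)
        elif action == "revoke":
            granted.discard(uid)
        elif action == "deny":
            denied.add(uid)
        elif action == "undeny":
            denied.discard(uid)
        else:
            raise ValueError(f"unknown action: {action}")
    return {**doc, "grantedUid": sorted(granted), "deniedUid": sorted(denied)}


stored = {"id": "doc-1", "ownerUid": "userA", "grantedUid": ["userB"], "deniedUid": []}
queue = [("doc-1", "grant", "userC"), ("doc-1", "deny", "userB"), ("doc-1", "revoke", "userB")]

pending = coalesce(queue)
updated = apply_changes(stored, pending["doc-1"])
print(updated)
```

Coalescing per document keeps the number of reindexing operations proportional to the number of documents touched, not to the number of individual rights changes.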
RE: Selective Searches Based on User Identity
Thanks for the tip. I went to their website (www.fastsearch.com), and got as far as the second line, top left 'A Microsoft Subsidiary'...at which point, hopes of it being another open source solution quickly faded. ;-)

Seriously, though, it looks like an interesting product, but open source is a mandatory requirement for my particular application. But the fact they implemented this functionality would seem to support that it's a valid requirement, and I'll keep plugging away on it. Thank you very much for bringing FAST to my attention...I appreciate it!

Best regards...

Terence

-----Original Message-----
From: Matt Weber [mailto:m...@mattweber.org]
Sent: May 12, 2009 14:06
To: solr-user@lucene.apache.org
Subject: Re: Selective Searches Based on User Identity

> I also work with the FAST Enterprise Search engine and this is exactly how their Security Access Module works. [...]
Re: Selective Searches Based on User Identity
Here is a good presentation on search security from the Infonortics Search Conference that was held a few weeks ago.

http://www.infonortics.com/searchengines/sh09/slides/kehoe.pdf

The approach you are using is called early-binding. As Jay mentioned, one of the downsides is updating the documents each time you have an ACL change. You could use the late-binding approach, which checks each result after the query but before you display it to the user. I don't recommend this approach because it will strain your security infrastructure: you will need to check whether the user can access each result. Good luck.

Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com

On May 12, 2009, at 1:21 PM, Jay Hill wrote:

> The only downside would be that you would have to update a document anytime a user was granted or denied access. [...]
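To illustrate why late-binding puts load on the security system, here is a minimal sketch in which every search hit triggers one access check. The CountingAcl class stands in for an external security service and is purely hypothetical, as are the document ids and uids.

```python
class CountingAcl:
    """Stand-in for an external security service that counts its lookups."""

    def __init__(self, allowed):
        self.allowed = allowed  # set of (uid, doc_id) pairs
        self.calls = 0

    def is_allowed(self, uid, doc_id):
        self.calls += 1
        return (uid, doc_id) in self.allowed


def late_binding_filter(hits, uid, acl):
    """Check each hit after the query and keep only the permitted ones."""
    return [doc_id for doc_id in hits if acl.is_allowed(uid, doc_id)]


acl = CountingAcl({("userB", "doc-1"), ("userB", "doc-3")})
hits = ["doc-1", "doc-2", "doc-3", "doc-4"]

visible = late_binding_filter(hits, "userB", acl)
print(visible, acl.calls)
```

The number of checks grows with the number of hits, and paging becomes awkward because a page of raw hits may shrink after filtering. With early-binding the restriction is part of the query, so no per-result check is needed.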
RE: Selective Searches Based on User Identity
In reply to both Matt and Jay's comments, the particular situation I'm dealing with is one where rights will change relatively little once they are established. Typically a document will be loaded and indexed, and a decision will be made on sharing that more-or-less immediately. It might change a couple of times after that, but that will be it. So early-binding seems like the better option. Thanks to both of you for your suggestions and help.

Terence

PS. I wish I had known about that conference...looks like it would have been very helpful to me right now!

-----Original Message-----
From: Matt Weber [mailto:m...@mattweber.org]
Sent: May 12, 2009 14:41
To: solr-user@lucene.apache.org
Subject: Re: Selective Searches Based on User Identity

> Here is a good presentation on search security from the Infonortics Search Conference that was held a few weeks ago. [...]
Re: Selective Searches Based on User Identity
Why can't you simply index a field authorized-to with value user-B and enrich any query you receive from a user with a mandatory query for that authorization?

paul

On 11 May 2009 at 17:50, Terence Gannon wrote:

> Can anybody point me in the direction of resources and/or projects regarding the following scenario; I have a community of users contributing content to a Solr index. By default, the user (A) who contributes a document owns it, and can see the document in their search results. The owner can then grant selective access to that document to other users. If another user (B) is granted access by A, then the document shows up in B's search results, along with whatever B has contributed and any other documents to which B has been granted access. Conversely, if B is not granted access to the document, it does not show up in their search results. I'm comfortable building this logic myself, so long as I'm not repeating the work of others in this area. Thanks, in advance, for any advice or information.
>
> Terence
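One way to read the "mandatory query" suggestion is that the application, not the user, builds the final request and always attaches the authorization restriction as a separate filter parameter. The sketch below assumes Solr's q and fq request parameters; the authorized-to field and the user-B value come from the message, while the function name and the pass-through list are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: only these user-supplied parameters are passed through.
PASS_THROUGH = ("rows", "start")


def enforce_authorization(user_params, uid):
    """Build request parameters with a filter the user cannot override."""
    term = '"' + uid.replace("\\", "\\\\").replace('"', '\\"') + '"'
    pairs = [
        ("q", user_params.get("q", "*:*")),
        # Any fq supplied by the user is ignored; only this one is sent.
        ("fq", f"authorized-to:{term}"),
    ]
    pairs += [(key, user_params[key]) for key in PASS_THROUGH if key in user_params]
    return pairs


request = enforce_authorization(
    {"q": "solr security", "fq": "authorized-to:*", "rows": "5"}, "user-B"
)
query_string = urlencode(request)
print(query_string)
```

Keeping the restriction in its own parameter means the user's query text is never concatenated with it. This only helps if end users cannot reach the Solr instance directly, as noted elsewhere in the thread.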