Re: Selective Searches Based on User Identity

2009-05-13 Thread Michael Ludwig

Terence Gannon wrote:

Paul -- thanks for the reply, I appreciate it.  That's a very
practical approach, and is worth taking a closer look at.  Actually,
taking your idea one step further, perhaps three fields; 1) ownerUid
(uid of the document's owner) 2) grantedUid (uid of users who have
been granted access), and 3) deniedUid (uid of users specifically
denied access to the document).


Grants might change quite a bit, the owner will likely remain the same.

Wouldn't it be better to include only the owner in the document and
store grants someplace else, like in an RDBMS or - if you don't want
one - a lightweight embedded database like BDB?

That way your application could tag an ineluctable filter query onto
each and every user query, ensuring that the results include only those
documents whose owner has granted the user access.

Considering that I'm a Solr/Lucene newbie, this approach might have a
disadvantage that escapes me, which is why other people haven't made
this particular suggestion. If so, I'd be happy to learn why this isn't
preferable.

Michael Ludwig


RE: Selective Searches Based on User Identity

2009-05-13 Thread Terence Gannon
Yes, the ownerUid will likely be assigned once and never changed.  But
you still need it, in order to keep track of who has contributed which
document.

I've been going over some of the simpler query scenarios, and Solr is
capable of handling them without having to resort to an external
RDBMS.  In order to limit documents to those which a given user owns,
or those to which he has been granted access, the syntax fragment
would be something like:

ownerUid:ab2734 OR grantedUid:ab2734

where ab2734 is the uid for the user doing the query.  However, I'm
less comfortable with more complex query scenarios, particularly if
the concept of groups is eventually introduced, which is likely in my
scenario.
In the latter case, it may be necessary to use an external RDBMS.
I'll plead ignorance of the 'ineluctable filter query' and will have
to read up on that one.
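
A minimal sketch of how an application might build that fragment, assuming
the field names from this thread and a hypothetical helper named
access_fragment. The escaping helper is also an assumption: it
backslash-escapes characters that are special in the Lucene query syntax so
that a uid cannot alter the query. Note the upper-case OR, since the
standard query parser only recognises boolean operators in capitals.

```python
import re

# Characters with special meaning in the Lucene/Solr query syntax.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|\s])')


def escape_term(value: str) -> str:
    """Backslash-escape query-syntax characters in a single term."""
    return _SPECIAL.sub(r'\\\1', value)


def access_fragment(uid: str) -> str:
    """Restrict results to documents the user owns or has been granted."""
    term = escape_term(uid)
    # Boolean operators are written in upper case; a lower-case "or"
    # would be treated as an ordinary search term.
    return f"(ownerUid:{term} OR grantedUid:{term})"


print(access_fragment("ab2734"))
```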

With respect to updates to rights, they are not likely to be that
frequent, but when they happen, the entire document will have to be
reindexed rather than simply updating the grantedUid and/or deniedUid
fields.  I don't believe Solr supports the updating of individual
fields, at least not yet.  This may be another reason to eventually go
to an external RDBMS.

Thanks very much for your help!

Terence


Re: Selective Searches Based on User Identity

2009-05-13 Thread Michael Ludwig

Hi Terence,

Terence Gannon wrote:

Yes, the ownerUid will likely be assigned once and never changed.  But
you still need it, in order to keep track of who has contributed which
document.


Yes, of course!


I've been going over some of the simpler query scenarios, and Solr is
capable of handling them without having to resort to an external
RDBMS.


The database is only to store grants - it's not to help with searching.
It would look like this:

  grantee | grant
  --------+-----------------
  fritz   | fred,frank,egon
  frank   | egon,fritz
  egon    | terence,frank
  ...

Each user is granted access to his own documents and to those he
has received grants for.


In order to limit documents to those which a given user owns,
or those to which he has been granted access, the syntax fragment
would be something like:

ownerUid:ab2734 OR grantedUid:ab2734


I think it could be:

  ownerUid:egon OR ownerUid:terence OR ownerUid:frank

No need to embed grants in the document.
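
As a rough sketch of that owner-level idea: the dictionary below is a
hypothetical stand-in for the grants table (a real application would read
the rows from the database), and owner_filter is an illustrative name, not
part of Solr.

```python
# Hypothetical stand-in for the grants table shown above.
GRANTS = {
    "fritz": ["fred", "frank", "egon"],
    "frank": ["egon", "fritz"],
    "egon": ["terence", "frank"],
}


def owner_filter(user: str, grants: dict[str, list[str]]) -> str:
    """Build an ownerUid clause for the user and the owners who granted him access."""
    owners = [user] + [o for o in grants.get(user, []) if o != user]
    return " OR ".join(f"ownerUid:{owner}" for owner in owners)


print(owner_filter("egon", GRANTS))
# ownerUid:egon OR ownerUid:terence OR ownerUid:frank
```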

Ah, I see my mistake now. You want grants based on the document, not on
the user - I had overlooked that fact. That makes my suggestion invalid.


I'll plead ignorance of the 'ineluctable filter query' and will have
to read up on that one.


I meant a filter query that the application tags onto the query on
behalf of the user and without the user being able to do anything about
it so he cannot circumvent the filter.
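
A sketch of what that could look like in the application layer, assuming
the ownerUid/grantedUid fields discussed in this thread and a uid taken
from the authenticated session (and already validated). The function name
and the parameter whitelist are illustrative assumptions; fq is Solr's
filter-query request parameter.

```python
from urllib.parse import urlencode


def build_search_params(user_params: dict[str, str], uid: str) -> list[tuple[str, str]]:
    """Return Solr request parameters with a server-side access filter."""
    # Whitelist what the end user may control; any fq they send is dropped.
    allowed = {"q", "start", "rows", "sort"}
    params = [(k, v) for k, v in user_params.items() if k in allowed]
    # The filter is added by the application, never by the user.
    params.append(("fq", f"ownerUid:{uid} OR grantedUid:{uid}"))
    return params


# The user tries to smuggle in a match-all filter; it is discarded.
params = build_search_params({"q": "budget report", "fq": "*:*"}, "ab2734")
print(urlencode(params))
```

This only helps if end users cannot reach the Solr instance directly, so
that every query passes through code like the above.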

Best regards,

Michael Ludwig


RE: Selective Searches Based on User Identity

2009-05-12 Thread Terence Gannon
Paul -- thanks for the reply, I appreciate it.  That's a very practical
approach, and is worth taking a closer look at.  Actually, taking your idea
one step further, perhaps three fields; 1) ownerUid (uid of the document's
owner) 2) grantedUid (uid of users who have been granted access), and 3)
deniedUid (uid of users specifically denied access to the document).  These
fields, coupled with some business rules around how they were populated
should cover off all possibilities I think.
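
One possible reading of those business rules, as a sketch only. The
precedence used here (owner always has access, an explicit denial beats a
grant) is an assumption; the thread does not specify it. Doc and
can_access are illustrative names.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    ownerUid: str
    grantedUid: set[str] = field(default_factory=set)
    deniedUid: set[str] = field(default_factory=set)


def can_access(doc: Doc, uid: str) -> bool:
    """Decide access from the three fields (assumed precedence, see above)."""
    if uid == doc.ownerUid:
        return True            # the owner always sees the document
    if uid in doc.deniedUid:
        return False           # an explicit denial overrides any grant
    return uid in doc.grantedUid


doc = Doc(ownerUid="egon", grantedUid={"frank", "fritz"}, deniedUid={"fritz"})
print(can_access(doc, "egon"), can_access(doc, "frank"), can_access(doc, "fritz"))
```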

Access to the Solr instance would have to be tightly controlled, but that's
something that should be done anyway.  You sure wouldn't want end users
preparing their own XML and throwing it at Solr -- it would be pretty easy
to figure out how to get around the access/denied fields and get at stuff
the owner didn't intend.

This approach mimics to some degree what is being done in the operating
system, but it's still elegant and provides the level of control required.
 Anybody else have any thoughts in this regard?  Has anybody implemented
anything similar, and if so, how did it work?  Thanks, and best regards...

Terence


Re: Selective Searches Based on User Identity

2009-05-12 Thread Matt Weber
I also work with the FAST Enterprise Search engine and this is exactly  
how their Security Access Module works.  They actually use a modified  
base-32 encoded value for indexing, but that is because they don't  
have the luxury of untokenized/un-processed String fields like Solr.


Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com








Re: Selective Searches Based on User Identity

2009-05-12 Thread Jay Hill
The only downside would be that you would have to update a document anytime
a user was granted or denied access. You would have to query before the
update to get the current values for grantedUID and deniedUID, remove/add
values, and update the index. If you don't have a lot of changes in the
system that wouldn't be a big deal, but if a lot of changes are happening
throughout the day you might have to queue requests and batch them.
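
The read-modify-write cycle with batching could be sketched roughly as
follows. The dictionary is a hypothetical in-memory stand-in for the index
(with Solr, the fields would need to be stored so the current values can be
fetched), and the action names grant, deny and revoke are assumptions made
for illustration.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the index: doc id -> stored fields.
index = {
    "doc1": {"ownerUid": "egon", "grantedUid": ["frank"], "deniedUid": []},
}


def apply_batch(index, changes):
    """Apply queued (doc_id, action, uid) changes with one rewrite per document."""
    by_doc = defaultdict(list)
    for doc_id, action, uid in changes:
        by_doc[doc_id].append((action, uid))
    for doc_id, actions in by_doc.items():
        doc = dict(index[doc_id])            # 1. fetch the current values
        granted = set(doc["grantedUid"])
        denied = set(doc["deniedUid"])
        for action, uid in actions:          # 2. add/remove uids
            if action == "grant":
                granted.add(uid)
                denied.discard(uid)
            elif action == "deny":
                denied.add(uid)
                granted.discard(uid)
            elif action == "revoke":
                granted.discard(uid)
        doc["grantedUid"] = sorted(granted)
        doc["deniedUid"] = sorted(denied)
        index[doc_id] = doc                  # 3. re-add the whole document
    return len(by_doc)


queue = [("doc1", "grant", "fritz"), ("doc1", "deny", "frank")]
rewrites = apply_batch(index, queue)
print(rewrites, index["doc1"])
```

Grouping the queue by document means two changes to the same document cost
a single rewrite rather than two.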

-Jay






RE: Selective Searches Based on User Identity

2009-05-12 Thread Terence Gannon
Thanks for the tip.  I went to their website (www.fastsearch.com), and got
as far as the second line, top left 'A Microsoft Subsidiary'...at which
point, hopes of it being another open source solution quickly faded. ;-)
Seriously, though, it looks like an interesting product, but open source is
a mandatory requirement for my particular application.  But the fact they
implemented this functionality would seem to support that it's a valid
requirement, and I'll keep plugging away on it.  Thank you very much for
bringing FAST to my attention...I appreciate it!  Best regards...

Terence





Re: Selective Searches Based on User Identity

2009-05-12 Thread Matt Weber
Here is a good presentation on search security from the Infonortics  
Search Conference that was held a few weeks ago.


http://www.infonortics.com/searchengines/sh09/slides/kehoe.pdf

The approach you are using is called early-binding.  As Jay mentioned,
one of the downsides is updating the documents each time you have an
ACL change.  You could use the late-binding approach, which checks each
result after the query but before you display it to the user.  I don't
recommend this approach: it will strain your security infrastructure,
since you will need to check whether the user can access each result.
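
To make the cost of late-binding concrete, here is a small sketch. The ACL
dictionary and the check function are hypothetical stand-ins for an
external security system; the counter shows that one access check is made
per hit, whether or not the hit ends up visible.

```python
from typing import Callable, Iterable, Iterator


def late_binding_filter(hits: Iterable[dict], uid: str,
                        is_allowed: Callable[[str, str], bool]) -> Iterator[dict]:
    """Yield only the hits the security system approves for this user."""
    for hit in hits:
        # One access check per result: this is the per-hit cost.
        if is_allowed(uid, hit["id"]):
            yield hit


# Hypothetical ACL lookup standing in for the external security system.
ACL = {"doc1": {"egon", "frank"}, "doc2": {"fritz"}}
calls = 0


def check(uid: str, doc_id: str) -> bool:
    global calls
    calls += 1
    return uid in ACL.get(doc_id, set())


hits = [{"id": "doc1"}, {"id": "doc2"}, {"id": "doc3"}]
visible = list(late_binding_filter(hits, "egon", check))
print(visible, calls)
```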


Good luck.

Thanks,

Matt Weber
eSr Technologies
http://www.esr-technologies.com












RE: Selective Searches Based on User Identity

2009-05-12 Thread Terence Gannon
In reply to both Matt and Jay's comments, the particular situation I'm
dealing with is one where rights will change relatively little once
they are established.  Typically a document will be loaded and
indexed, and a decision will be made on sharing that more-or-less
immediately.  It might change a couple of times after that, but that
will be it.  So early-binding seems like the better option.  Thanks to
both of you for your suggestions and help.

Terence

PS. I wish I had known about that conference...looks like it would
have been very helpful to me right now!



Re: Selective Searches Based on User Identity

2009-05-11 Thread Paul Libbrecht
Why can't you simply  index a field authorized-to with value user-B  
and enrich any query you receive from a user with a mandatory query  
for that authorization?


paul


On 11 May 2009, at 17:50, Terence Gannon wrote:

Can anybody point me in the direction of resources and/or projects
regarding the following scenario: I have a community of users
contributing content to a Solr index.  By default, the user (A) who
contributes a document owns it, and can see the document in their
search results.  The owner can then grant selective access to that
document to other users.  If another user (B) is granted access by A,
then the document shows up in B's search results, along with whatever
B has contributed and any other documents to which B has been granted
access.  Conversely, if B is not granted access to the document, it
does not show up in their search results.

I'm comfortable building this logic myself, so long as I'm not
repeating the work of others in this area.  Thanks, in advance, for
any advice or information.

Terence



