The first question to ask is whether you need server-level security for this
sharing, or whether application-level security is enough.
If you want or need server-level security (that is, if someone could access
the ML server directly with their own credentials, start issuing queries, and
gain access to docs they shouldn't see), then the only way I know of to do this
right is to use the server-supplied role-based security. It is *hard coded,
baked in* ... you simply cannot break it ... you can't even tell that documents
you do not have access to exist. It's also extremely efficient at query time
because it's done very deep in the server. But it comes at the "price" of using
the built-in security measures, mainly the price of having to touch every
document whose roles or set of collections change.
This is not a bad thing. It's a great thing, but it does limit your choices and
there is a performance hit. (How much? As with most things, "it depends".)
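
As a rough illustration of that trade-off, here is a toy model in plain Python.
It is not MarkLogic code: the Store class, its methods, and the role names are
invented for this sketch. It only shows the shape of the idea: the role filter
runs inside the store, so hidden documents are invisible to the caller, and
granting a role means rewriting each affected document.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    uri: str
    text: str
    read_roles: set = field(default_factory=set)  # roles allowed to read this doc


class Store:
    """Hypothetical stand-in for a server that checks roles on every query."""

    def __init__(self):
        self.docs = {}
        self.writes = 0  # counts document updates ("touches")

    def insert(self, doc):
        self.docs[doc.uri] = doc
        self.writes += 1

    def search(self, user_roles, word):
        # The role filter is applied inside the store, so callers never
        # learn about documents they cannot read.
        return sorted(
            d.uri for d in self.docs.values()
            if d.read_roles & user_roles and word in d.text
        )

    def grant(self, uris, role):
        # Changing permissions rewrites every affected document.
        for uri in uris:
            self.docs[uri].read_roles.add(role)
            self.writes += 1


store = Store()
for i in range(3):
    store.insert(Doc(f"/doc/{i}.xml", "quarterly report", {"alice-role"}))

before = store.search({"bob-role"}, "report")  # bob has no access yet
writes_before = store.writes
store.grant(["/doc/0.xml", "/doc/1.xml"], "bob-role")
after = store.search({"bob-role"}, "report")
touched = store.writes - writes_before  # one write per shared document
```

In this model the number of writes grows linearly with the number of documents
shared, which is the cost being weighed here.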

If, on the other hand, you physically restrict access to the ML server to your 
app only, and you are confident in *your code* ... then there are other options.

One I have been thinking about lately is the use of the ML7 semantics features.
This is a very lightweight way of storing lists of things; it could, for
example, store associations between users and the documents they can view.
That is similar to storing this data in one or more XML files, but much faster
for some use cases because of the way it's indexed, and you don't have to
change the target documents to change the list of who can see them, unlike
changing what collections or roles a document has. It does require a two-phase
query, though: a first query to list the set of documents a user is allowed to
see, then a second query that applies that list as a constraint on a search.
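
The two-phase idea can be sketched in plain Python. This is a toy model, not
the MarkLogic semantics API: the triples are ordinary tuples, and the
"canView" predicate and the allowed_uris, search and search_as helpers are
names made up for the sketch.

```python
# Toy model: (subject, predicate, object) triples held apart from the documents.
triples = {
    ("alice", "canView", "/doc/1.xml"),
    ("alice", "canView", "/doc/2.xml"),
    ("bob", "canView", "/doc/2.xml"),
}

documents = {
    "/doc/1.xml": "budget forecast",
    "/doc/2.xml": "budget review",
    "/doc/3.xml": "budget secrets",
}


def allowed_uris(user):
    """Phase 1: list the documents this user may see."""
    return {o for (s, p, o) in triples if s == user and p == "canView"}


def search(word, restrict_to):
    """Phase 2: run the search, constrained to the allowed URIs."""
    return sorted(
        uri for uri, text in documents.items()
        if uri in restrict_to and word in text
    )


def search_as(user, word):
    return search(word, allowed_uris(user))


bob_before = search_as("bob", "budget")

# Sharing a document adds one triple; the document itself is not rewritten.
triples.add(("bob", "canView", "/doc/1.xml"))
bob_after = search_as("bob", "budget")
```

Note that the access check lives entirely in application code here, so this
only holds up if nothing else can query the documents directly.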

I
-----------------------------------------------------------------------------
David Lee
Lead Engineer
MarkLogic Corporation
[email protected]
Phone: +1 812-482-5224
Cell:  +1 812-630-7622
www.marklogic.com<http://www.marklogic.com/>


From: [email protected] 
[mailto:[email protected]] On Behalf Of Timothy W. Cook
Sent: Wednesday, December 11, 2013 9:44 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Document Level Authorization (Roles and 
Users)



On Wed, Dec 11, 2013 at 11:41 AM, David Lee 
<[email protected]<mailto:[email protected]>> wrote:
Harry, how many users have you tried with this scheme?
I am myself considering something for a demo app but not sure if it scales to 
thousands or hundreds of thousands or millions of users.


This is my concern also.  I need to scale to millions of users.  However, each 
user will likely have less than one hundred other users to share documents with.

There is also the issue that if you want to share a large set of documents with
a new user (say 10,000 docs), then those 10,000 docs need to be "touched" (e.g.
read and written); this could be a heavy operation.


This is a scalability issue I would like to hear about from anyone who has
experience with it. I could easily have a user with 10,000 or more documents.
What is the performance like when a new share is created across all of them?


The alternative, which is not as elegant but might perform better, is to keep
access lists as data (say in an XML file or files) and handle the access
control at the user level.
You are right that this is neither as clean nor as proven as using the
system-level access control, but it might be
* faster
* easier


This seems to be a brittle approach.  Though it may be the best?


Another option might be to store the access list of a document in document
properties. You still have to touch the same number of files, but with
potentially smaller changes (assuming the access list is smaller than the
document), and you can do property-based searches combined with document
searches, so no "joining" is required.
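
A toy Python model of the properties idea, for comparison. Again, this is not
MarkLogic code: the Entry class, the viewers field, and the search and share
helpers are hypothetical. The access list sits next to each document, so one
query can test both, but sharing still visits every document.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    body: str                                  # the document content
    viewers: set = field(default_factory=set)  # access list kept in properties


entries = {
    "/doc/1.xml": Entry("budget forecast", {"alice"}),
    "/doc/2.xml": Entry("budget review", {"alice", "bob"}),
}


def search(user, word):
    # One pass: the property test and the content test are combined,
    # so there is no separate list of URIs to join against.
    return sorted(
        uri for uri, e in entries.items()
        if user in e.viewers and word in e.body
    )


def share(uris, user):
    # Every shared document is still touched, but only its (small)
    # property record changes; the body is left alone.
    touched = 0
    for uri in uris:
        entries[uri].viewers.add(user)
        touched += 1
    return touched


bob_before = search("bob", "budget")
touched = share(["/doc/1.xml"], "bob")
bob_after = search("bob", "budget")
```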

This approach also crossed my mind because in relative terms, my access list 
will be small.

I think this would make a great paper or blog

"How to handle access control of large numbers of users and documents"

Good idea.  Now we just need to do the research.  :-)


One thing I am not certain of yet: what are the security and performance
implications of using keywords in a document and then, through a query,
providing visibility (to the UI) of only some of the documents? IOW: a user
might have read access to documents in a collection, but not know that they
exist and not have any access to the collection except via the UI. Security
through obscurity kind of rings out that idea though. Thoughts?

--Tim


--
MLHIM VIP Signup: http://goo.gl/22B0U
============================================
Timothy Cook, MSc           +55 21 94711995
MLHIM http://www.mlhim.org
Like Us on FB: https://www.facebook.com/mlhim2
Circle us on G+: http://goo.gl/44EV5
Google Scholar: http://goo.gl/MMZ1o
LinkedIn Profile:http://www.linkedin.com/in/timothywaynecook