Re: How to map the atlassian confluence security model to manifoldcf

Karl Wright Thu, 30 May 2013 03:18:57 -0700

Hi Markus,

Have you had any luck with this?


Karl



On Sun, May 26, 2013 at 9:32 AM, Karl Wright <[email protected]> wrote:

> Hi Markus,
>
> The usual way these things map is that there is an API call that gets a
> list of groups and users that can see
> the resource, and *maybe* there's a list of groups and users that are
> prohibited from seeing the resource.
> These user ids and group ids get used as access tokens.  The semantics of
> the ManifoldCF access tokens are that prohibitions supersede allowances.
> The authority service then simply returns the user id and a list of
> group ids to which the user belongs, provided such functionality exists in
> the API.
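
A minimal sketch of the token semantics described above, assuming plain string tokens. The function name `can_see` and the sample ids are hypothetical and are not part of the ManifoldCF API; the sketch only illustrates that a matching prohibition wins over any matching allowance.

```python
def can_see(user_tokens, allow_tokens, deny_tokens):
    """Return True if the user may see the document.

    A single matching deny token always wins; otherwise at least one
    allow token has to match one of the user's tokens.
    """
    user_tokens = set(user_tokens)
    if user_tokens & set(deny_tokens):
        return False
    return bool(user_tokens & set(allow_tokens))


# What an authority would hand back for one user: the user id plus the
# ids of the groups the user belongs to (sample values, made up here).
alice = {"alice", "G1", "G2"}

print(can_see(alice, allow_tokens={"G1"}, deny_tokens=set()))   # True
print(can_see(alice, allow_tokens={"G1"}, deny_tokens={"G2"}))  # False
```
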
>
> In the case of Atlassian, where parents have both prohibition lists as
> well as allowance lists, it is usually the case that the prohibition lists
> can simply be unioned when they are flattened.  Being a member of any
> prohibited group in the hierarchy is sufficient to exclude a user from
> seeing the resource.  For allowance
> lists, however, it is not possible to merge the lists in a simple way,
> since as you point out you are trying to
> capture an "AND" relationship.  To make this concrete, say you have three
> objects - A->B->C, and let's say
> P(A) is the allow list for A, P(B) for B, etc.  Then, you want
> "user_in(P(A)) AND user_in(P(B)) AND user_in(P(C))".
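
The A->B->C case can be sketched as follows. This is only an illustration under stated assumptions: each level is a hypothetical dict with an "allow" and a "deny" set, an empty allow set is taken to mean "no restriction at this level", and none of the names come from Confluence or ManifoldCF.

```python
def flatten_deny(chain):
    """Union the deny lists of a page and all of its ancestors."""
    denied = set()
    for level in chain:
        denied |= level["deny"]
    return denied


def passes_allow_chain(user_tokens, chain):
    """user_in(P(A)) AND user_in(P(B)) AND user_in(P(C)).

    Assumption: an empty allow set means the level is unrestricted.
    """
    return all(
        not level["allow"] or bool(user_tokens & level["allow"])
        for level in chain
    )


def can_see_page(user_tokens, chain):
    user_tokens = set(user_tokens)
    if user_tokens & flatten_deny(chain):
        return False  # member of a prohibited group somewhere in the chain
    return passes_allow_chain(user_tokens, chain)


chain = [
    {"allow": {"G1", "G2"}, "deny": set()},   # A
    {"allow": {"G2", "G3"}, "deny": {"G5"}},  # B
    {"allow": {"G4"}, "deny": set()},         # C
]

print(can_see_page({"u1", "G2", "G4"}, chain))        # True
print(can_see_page({"u2", "G1", "G4"}, chain))        # False, fails P(B)
print(can_see_page({"u3", "G2", "G4", "G5"}, chain))  # False, G5 is denied
```
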
>
> I agree that the only viable way to flatten this is to create an access
> token for every combination of group
> permissions you are likely to see.  So if there were the groups G1 G2 G3
> G4 and G5, there would have to be
> access tokens for "G1 AND G2", "G2 AND G3", "G1 AND G2 AND G3", etc.  The
> authority service would then be stuck returning a combinatorially large
> number of access tokens, and that would not do at all.
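
To get a feel for the growth, here is a small sketch that enumerates one token per non-empty combination of a user's groups. The "G1 AND G2" token format and the function name `and_tokens` are assumptions made for illustration only.

```python
from itertools import combinations


def and_tokens(groups):
    """Build one token per non-empty combination of the given groups."""
    groups = sorted(groups)
    return [
        " AND ".join(combo)
        for size in range(1, len(groups) + 1)
        for combo in combinations(groups, size)
    ]


tokens = and_tokens(["G1", "G2", "G3", "G4", "G5"])
print(len(tokens))  # 31, i.e. 2**5 - 1
print(tokens[5])    # G1 AND G2
```

A user in 20 groups would already need 2**20 - 1 = 1048575 such tokens, which is why this approach does not scale.
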
>
> An alternative is to try and find a way to implement the AND relationship
> between access tokens natively.
> To do it this way requires an open-ended and potentially combinatorially
> large number of index fields.  You'd
> need one such field per page, seems to me.  In theory Solr has a way of
> creating N fields at index time, where
> you just use a special field prefix, and the field is created.  But there
> are two problems with this.  First,
> at query time, the Lucene query the Solr plugin would need to build would
> contain a clause for every page in
> Atlassian.  That's not going to work.  Second, we'd need a default value
> for access tokens for all pages in
> Atlassian for every document indexed, and I don't think that's
> configurable in Solr either.
>
> Another alternative is to post-filter results.  This will require
> significant support in ManifoldCF, especially in the
> authority connector, but it could be added with not too much trouble.  The
> downside is that there are going to
> be cases where one would need to go through a lot of results to find the
> few that one is allowed to see.  I'm
> willing to do this, though, if there are no better alternatives.
>
> But there's one more possibility, which is worth thinking about.
> Specifically, try the approach of actually calculating the minimal
> user/group list for the document, at indexing time.  So the access tokens
> are group id's and user id's, and the connector logic actually calculates
> the minimal intersection of P(A), P(B), and P(C) in the example above.
>
> Example 1:
> P(A) was G1 or G2
> P(B) was G2 or G3
> P(C) was G4
>
> ...then the logic would explicitly find all users which matched ALL of
> those criteria - which would mean that the
> access token list for the document would be a list of individual user id's
> in this case, not groups - specifically the list of user ids of those users
> that belong to G4 and also to either G2 or to both G1 and G3.
>
> Example 2:
> P(A) was G1 or G2 or G3
> P(B) was G2 or G3
> P(C) was G3
>
> ...then the logic would return just the group id for G3.
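
One possible way to compute such a token list at indexing time, sketched under assumptions: the allow lists contain only group ids, and `members` is a hypothetical group-to-users map that the connector would have to obtain from Confluence. A group that appears in every allow list is emitted as a group token; the remaining allowed users are emitted as individual user ids.

```python
def minimal_tokens(chain, members):
    """Return access tokens for a page whose ancestors' allow lists are `chain`.

    chain   -- list of allow lists (sets of group ids), one per level
    members -- dict mapping group id -> set of user ids (assumed input)
    """
    # Users who pass every level: intersect the per-level unions.
    per_level = [set().union(*(members[g] for g in allow)) for allow in chain]
    allowed_users = set.intersection(*per_level)

    # A group listed at every level admits all of its members.
    common_groups = set.intersection(*(set(allow) for allow in chain))
    covered = set().union(*(members[g] for g in common_groups))

    # Group tokens first, then the users no common group accounts for.
    return sorted(common_groups) + sorted(allowed_users - covered)


members = {
    "G1": {"ann", "bob"},
    "G2": {"bob", "cat"},
    "G3": {"cat", "dan"},
    "G4": {"bob", "dan", "eve"},
}

example_1 = [{"G1", "G2"}, {"G2", "G3"}, {"G4"}]
example_2 = [{"G1", "G2", "G3"}, {"G2", "G3"}, {"G3"}]

print(minimal_tokens(example_1, members))  # ['bob']
print(minimal_tokens(example_2, members))  # ['G3']
```

In the first case the result is a list of user ids, so a later change to bob's group membership would only take effect after the page is reindexed.
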
>
> The only problem with this approach that I can see is that if the sysadmin
> structures things like example 1, the
> only way a user would be rendered unable to see such a document would be
> via reindexing.  Changing the user's group affinity alone would not be
> sufficient in that case.  However, I strongly suspect that real Atlassian
> sysadmins do things more like Example 2 than Example 1.  What do you think?
>
> Karl
>
>
>
> On Sat, May 25, 2013 at 8:20 PM, Markus Schuch <[email protected]> wrote:
>
>> Hi Karl,
>>
>> no need to apologize... a response in less than 24 hours to an open
>> source project's mailing list entry is perfect to me ;) - so thank you for
>> the quick response and thank you for sacrificing your valuable holiday
>> weekend time.
>>
>> The confluence API returns user and/or group names when requesting
>> permissions for a page.
>>
>> see:
>>
>> https://developer.atlassian.com/display/CONFDEV/Remote+Confluence+Methods#RemoteConfluenceMethods-Permissions.1
>>
>> https://developer.atlassian.com/display/CONFDEV/Remote+Confluence+Data+Objects#RemoteConfluenceDataObjects-contentpermissionContentPermission
>>
>> But the API methods for retrieving page permissions do not respect
>> permissions inherited from parent pages which is very sad. (refer to
>> https://jira.atlassian.com/browse/CONF-14965)
>>
>> To work around this problem we will have to write a confluence plugin that
>> can give us the effective permissions for a page.
>> We looked into that and we think it is possible.
>> In theory the effective page permissions retrieved by our plugin would be
>> a list of group names and/or usernames. The groupnames have to be ANDed to
>> respect permissions inherited from parent pages. We can concatenate all
>> needed combinations of group and user names into single access tokens to
>> create a "flattened" version of the permission hierarchy. So far so good...
>>
>> But another problem arises:
>> The authority connector would also have to return access tokens that are
>> compatible with the flattened permission hierarchy and therefore we must build
>> all possible combinations of the user's group names. If our math is correct,
>> there will be (2^n)-1 access tokens for a user (where n is the number of
>> distinct groups the user is a member of). Additionally there will be more
>> combinations with the username. This will most probably not perform well
>> for users with many group memberships.
>>
>> I see these 2 options:
>> - We could implement folder-level access tokens for a constant number X of
>> folder levels.
>> So the output connector would need to reject documents with more than X
>> folder levels.
>> Maybe there is a built-in limit on page levels in confluence... if not,
>> then this solution is not ideal.
>> - Start to think about post filtering...
>>
>> Regards,
>> Markus
>>
>> -----------------------------------------
>>
>> Sent: Saturday, 25 May 2013 at 16:54
>> From: "Karl Wright" <[email protected]>
>> To: "[email protected]" <[email protected]>
>> Subject: Re: How to map the atlassian confluence security model to
>> manifoldcf
>>
>> Hi Markus,
>>
>> Sorry for the slow response - it is a holiday weekend in the States, and
>> that has managed to impact me to some degree.
>>  Anyhow, I've looked at the doc on Atlassian security, and I have some
>> questions.  First, when you call the Atlassian API, and request security
>> information for a document, in what form does it come back?  If it comes
>> back as a minimal list of groups and users which can see the document, then
>> you probably just want the access tokens for this connector to be group
>> names/ids and user names/ids.  If it is more complicated, and basically you
>> have to ascend the hierarchy either explicitly or implicitly, then we'll
>> have to work a bit harder.  Either we'll have to find a flat mapping of
>> folders to access tokens, or we'll have to look at extending the framework
>> to handle more stuff.
>>
>> As far as the folder-level security, the reason it is deprecated at the
>> moment is because it is very challenging to implement properly in a
>> standard search engine with a fixed schema, since there are N possible
>> folder parents, where N is determined by an individual document.
>> Furthermore, the model is not really applicable to the case where there is
>> a hierarchy that cannot be flattened. But, depending on what the answer is
>> to my question above, if needed we can try to come up with a workable
>> folder implementation, and extend the Solr connector and plugins as well.
>>
>> Karl
>>
>>
>>
>> On Fri, May 24, 2013 at 6:57 PM, Markus Schuch <[email protected]> wrote:
>>
>> Hi,
>>
>> we are currently writing a repository connector for confluence.
>> We are using the solr output connection on Solr 4.x.
>> Seeding, versioning, processing works already and now we have to face
>> security.
>>
>> Compared to the already supported repositories by mcf, confluence seems
>> to have a different security model.
>>
>> There are "Space" permissions for a whole wiki space and these can easily
>> be mapped as shareAllowTokens but there are also page restrictions. Page
>> restrictions are attached to each page (page = document) and page
>> restrictions are inherited.
>>
>> See "Example of Child Page Restrictions" in the Confluence Doc:
>>
>> https://confluence.atlassian.com/display/DOC/Page+Restrictions
>>
>> The inheritance of page restrictions makes things difficult.
>> If we are correct, then it is not sufficient to add the page restrictions
>> as document level access tokens, because the query time filtering handles
>> the user's access tokens (e.g. group memberships) as a disjunction. Instead
>> we probably need a hierarchical, folder-based structure of access tokens to
>> map the inheritance of the page restrictions correctly.
>> The current Solr SearchComponent does not support folder level access
>> tokens and the book (mcf in action) says that these kinds of tokens are
>> considered deprecated.
>> To cut a long story short... we are stuck at the moment.
>>
>> Our questions:
>> Did anyone already manage to map confluence security to mcf/solr?
>> Or does somebody have an idea how a confluence-like security model can be
>> mapped to mcf/solr?
>>
>> Thanks in advance
>> Markus
>>
>
>
