[ 
https://issues.apache.org/jira/browse/OAK-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

angela updated OAK-3933:
------------------------
    Description: 
Membership information of a group currently is stored as an unsorted set of 
IDs, spread over multiple child nodes, using multivalued properties (of 1000 
each).

This content structure makes certain operations relatively slow. Affected 
methods are:

- Group.isMember(Authorizable)
- Group.isDeclaredMember(Authorizable)
- Group.addMember(Authorizable)
- Group.addMembers(String...)

1) Checking for declared membership

When the authorizable to be checked is not a member, all child nodes need to be 
read and examined (in the other case, checking stops when a match is found).

2) Checking for inherited membership

The membership IDs do not reveal the type of authorizable. In order to check 
inherited membership as well, the authorizable with the given ID needs to be 
read from storage in order to check the type.


Below are a few ideas for how this might be improved (however, changing the 
structure would require a migration step).

1) Avoid having to read all child nodes to check declared membership

Assuming an alphanumeric ID structure, this could be achieved by modifying the 
structure as follows:

- as before, start with a single node

- when a new member needs to be inserted and the candidate node is already full 
(has 1000 entries), create a new child node named after the first character of 
the authorizable ID

- when this "level 1" node is full, start using "level 2" nodes, and so on

(assuming the ID structure is suitable for that, otherwise a different hash 
could be used)

To check for membership, we wouldn't need to read *all* child nodes, but only 
those where the node name is a prefix match of the ID.
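
The scheme above could be sketched in memory roughly as follows. This is only 
an illustration of the idea, not Oak code: the class PrefixMemberStore, its 
methods, and the configurable capacity (1000 in the description) are 
hypothetical, child nodes are keyed by the next character of the ID, and the 
sketch simply lets a node overflow once an ID has no characters left.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** In-memory sketch of a prefix-sharded member store (hypothetical, not Oak code). */
class PrefixMemberStore {

    /** One storage node: a bounded set of IDs plus children keyed by the next ID character. */
    private static final class Node {
        final Set<String> ids = new HashSet<>();
        final Map<Character, Node> children = new HashMap<>();
    }

    private final Node root = new Node();
    private final int capacity;   // entries per node; 1000 in the description
    private int lastNodesRead;    // nodes visited by the most recent contains() call

    PrefixMemberStore(int capacity) {
        this.capacity = capacity;
    }

    /** Adds an ID; returns false if it is already a member. */
    boolean add(String id) {
        if (contains(id)) {
            return false;
        }
        Node node = root;
        int depth = 0;
        // Descend along the ID's characters while the candidate node is full.
        while (node.ids.size() >= capacity && depth < id.length()) {
            node = node.children.computeIfAbsent(id.charAt(depth), c -> new Node());
            depth++;
        }
        // Assumption: if the ID is exhausted, the last node is allowed to overflow.
        node.ids.add(id);
        return true;
    }

    /** Checks membership by reading only the nodes on the ID's prefix path. */
    boolean contains(String id) {
        lastNodesRead = 0;
        Node node = root;
        int depth = 0;
        while (node != null) {
            lastNodesRead++;
            if (node.ids.contains(id)) {
                return true;
            }
            if (depth >= id.length()) {
                return false;
            }
            node = node.children.get(id.charAt(depth));
            depth++;
        }
        return false;
    }

    int lastNodesRead() {
        return lastNodesRead;
    }
}
```

In this sketch a negative lookup reads at most one node per character of the 
ID plus the root, independent of how many sibling nodes exist.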


2) Avoid having to instantiate authorizables for declared membership checks

- put limited type information into the stored IDs, such as "u" and "g" 
prefixes; that way the code could identify authorizables that are users and 
avoid having to instantiate them

(this assumes that an ID that refers to a user will never refer to a group in 
the future)
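
A minimal sketch of the type-prefix idea, again purely illustrative: the class 
TypedMemberRef and all of its methods are hypothetical names, and the encoding 
(a single leading "u" or "g" character in front of the ID) is just one possible 
reading of the proposal.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of member references that carry a type prefix (hypothetical, not Oak code). */
final class TypedMemberRef {

    enum Type { USER, GROUP }

    private TypedMemberRef() {
    }

    /** Encodes an ID with a one-character type prefix: "u" for users, "g" for groups. */
    static String encode(Type type, String id) {
        return (type == Type.USER ? "u" : "g") + id;
    }

    /** Reads the type from the stored reference without loading the authorizable. */
    static Type typeOf(String stored) {
        if (stored.isEmpty()) {
            throw new IllegalArgumentException("empty member reference");
        }
        switch (stored.charAt(0)) {
            case 'u':
                return Type.USER;
            case 'g':
                return Type.GROUP;
            default:
                throw new IllegalArgumentException("unknown type prefix: " + stored);
        }
    }

    /** Strips the type prefix and returns the plain ID. */
    static String idOf(String stored) {
        typeOf(stored); // validates the prefix
        return stored.substring(1);
    }

    /** Returns the IDs of group members only; user entries never need to be resolved. */
    static List<String> groupIdsToResolve(Iterable<String> storedRefs) {
        List<String> groups = new ArrayList<>();
        for (String ref : storedRefs) {
            if (typeOf(ref) == Type.GROUP) {
                groups.add(idOf(ref));
            }
        }
        return groups;
    }
}
```

With such an encoding, an inherited-membership check would only have to read 
the entries returned by groupIdsToResolve from storage.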
 

  was:
Membership information of a group currently is stored as an unsorted set of 
IDs, spread over multiple child nodes, using multivalued properties (of 1000 
each).

This content structure makes certain operations relatively slow:

1) Checking for declared membership

When the authorizable to be checked is not a member, all child nodes need to be 
read and examined (in the other case, checking stops when a match is found).

2) Checking for inherited membership

The membership IDs do not reveal the type of authorizable. In order to check 
inherited membership as well, the authorizable with the given ID needs to be 
read from storage in order to check the type.


Below are a few ideas for how this might be improved (however, changing the 
structure would require a migration step).

1) Avoid having to read all child nodes to check declared membership

Assuming an alphanumeric ID structure, this could be achieved by modifying the 
structure as follows:

- as before, start with a single node

- when a new member needs to be inserted and the candidate node is already full 
(has 1000 entries), create a new child node named after the first character of 
the authorizable ID

- when this "level 1" member is full, start using "level 2" members and so on

(assuming the ID structure is suitable for that, otherwise a different hash 
could be used)

To check for membership, we wouldn't need to read *all* child nodes, but only 
those where the node name is a prefix match of the ID.


2) Avoid having to instantiate authorizables for declared membership checks

- put limited type information into the stored IDs, such as "u" and "g" 
prefixes; that way the code could identify authorizables that are users and 
avoid having to instantiate them

(this assumes that an ID that refers to a user will never refer to a group in 
the future)
 


> potential improvements to membership management
> -----------------------------------------------
>
>                 Key: OAK-3933
>                 URL: https://issues.apache.org/jira/browse/OAK-3933
>             Project: Jackrabbit Oak
>          Issue Type: Epic
>          Components: core
>            Reporter: Julian Reschke
>            Assignee: angela
>
> Membership information of a group currently is stored as an unsorted set of 
> IDs, spread over multiple child nodes, using multivalued properties (of 1000 
> each).
> This content structure makes certain operations relatively slow:
> affected methods are
> - Group.isMember(Authorizable)
> - Group.isDeclaredMember(Authorizable)
> - Group.addMember(Authorizable)
> - Group.addMembers(String...)
> 1) Checking for declared membership
> When the authorizable to be checked is not a member, all child nodes need to 
> be read and examined (in the other case, checking stops when a match is 
> found).
> 2) Checking for inherited membership
> The membership IDs do not reveal the type of authorizable. In order to check 
> inherited membership as well, the authorizable with the given ID needs to be 
> read from storage in order to check the type.
> Below are a few ideas for how this might be improved (however, changing the 
> structure would require a migration step).
> 1) Avoid having to read all child nodes to check declared membership
> Assuming an alphanumeric ID structure, this could be achieved by modifying 
> the structure as follows:
> - as before, start with a single node
> - when a new member needs to be inserted and the candidate node is already 
> full (has 1000 entries), create a new child node named after the first 
> character of the authorizable ID
> - when this "level 1" member is full, start using "level 2" members and so on
> (assuming the ID structure is suitable for that, otherwise a different hash 
> could be used)
> To check for membership, we wouldn't need to read *all* child nodes, but only 
> those where the node name is a prefix match of the ID.
> 2) Avoid having to instantiate authorizables for declared membership checks
> - put limited type information into the stored IDs, such as "u" and "g" 
> prefixes; that way the code could identify authorizables that are users and 
> avoid having to instantiate them
> (this assumes that an ID that refers to a user will never refer to a group in 
> the future)
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
