[jira] Commented: (JCR-2760) Use hash codes instead of sequence numbers for string indexes

JIRA Wed, 06 Oct 2010 02:26:58 -0700

    [ 
https://issues.apache.org/jira/browse/JCR-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12918447#action_12918447
 ]


Sébastien Launay commented on JCR-2760:
---------------------------------------

These changes are interesting for low-level copies of workspace content across 
repositories, but also for clustering, because until now two cluster nodes 
could end up with colliding indexes for the same namespace. 

I agree with Thomas that a different solution should be considered in JR 3 for 
sharing namespaces (and maybe node types too), especially between cluster 
nodes (see JCR-1558).


> Use hash codes instead of sequence numbers for string indexes
> -------------------------------------------------------------
>
>                 Key: JCR-2760
>                 URL: https://issues.apache.org/jira/browse/JCR-2760
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>            Priority: Minor
>             Fix For: 2.2.0
>
>         Attachments: 
> 0001-JCR-2760-Use-hash-codes-instead-of-sequence-numbers.patch
>
>
> We use index numbers instead of namespace URIs or other strings in many 
> places. The two-way mapping between namespace URIs and index numbers is by 
> default stored in the repository-global ns_idx.properties file, and the index 
> numbers are allocated using a linear sequence. The problem with this approach 
> is that two repositories will easily end up with different string index 
> mappings, which makes it practically impossible to make low-level copies of 
> workspace content across repositories.
> The ultimate solution for this problem would be to store the namespace URIs 
> closer to the stored content, ideally as an implementation detail of a 
> persistence manager.
> An easier short-term solution would be to decrease the chances of two 
> repositories having different string index mappings. A simple (and 
> backwards-compatible) way to do this is to use the hash code of a namespace 
> URI as the basis of allocating a new index number. Hash collisions are fairly 
> unlikely, and can be handled by incrementing the initial hash code until the 
> collision is avoided. In the common case of no collisions (with a uniform 
> hash function the chance of a collision is less than 1% even with thousands of 
> registered namespaces) this solution allows workspaces to be copied between 
> repositories without worrying about the namespace index mappings.
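
The allocation scheme described in the issue can be sketched roughly as follows.
This is only an illustration under stated assumptions, not the attached patch:
the class HashedStringIndex and its methods are hypothetical names, the full
32-bit String.hashCode() value is assumed to be the starting index, and
persistence to ns_idx.properties is left out. The collisionProbability helper
uses the standard birthday approximation to check the "less than 1%" estimate
for a uniform 32-bit hash.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of hash-based string index allocation (not Jackrabbit code). */
class HashedStringIndex {

    private final Map<String, Integer> stringToIndex = new HashMap<>();
    private final Map<Integer, String> indexToString = new HashMap<>();

    /** Returns the index already mapped to the string, or allocates a new one. */
    public synchronized int stringToIndex(String string) {
        Integer existing = stringToIndex.get(string);
        if (existing != null) {
            return existing;
        }
        // Start from the hash code and increment until a free slot is found.
        // Assumption: the full int range is usable; overflow simply wraps around.
        int index = string.hashCode();
        while (indexToString.containsKey(index)) {
            index++;
        }
        stringToIndex.put(string, index);
        indexToString.put(index, string);
        return index;
    }

    /** Returns the string mapped to the index, or null if none is registered. */
    public synchronized String indexToString(int index) {
        return indexToString.get(index);
    }

    /**
     * Birthday approximation of the chance that at least two of n strings
     * share a hash code, assuming a uniform 32-bit hash function.
     */
    public static double collisionProbability(int n) {
        double slots = 4294967296.0; // 2^32 possible hash codes
        return 1.0 - Math.exp(-((double) n * (n - 1)) / (2.0 * slots));
    }
}
```

Two independent instances of this class assign the same index to the same
namespace URI as long as no collision occurs, which is the property the issue
is after. When a collision does occur, the resulting index depends on the
registration order, so the mappings are only likely to match, not guaranteed
to.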

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
