[jira] Commented: (JCR-1214) DocId.UUIDDocId should not have a string attr uuid, but two long's lsb and msb

Ard Schrijvers (JIRA) Wed, 14 Nov 2007 01:27:04 -0800

    [ 
https://issues.apache.org/jira/browse/JCR-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542368
 ]


Ard Schrijvers commented on JCR-1214:
-------------------------------------

>no, I'm afraid there isn't, but it's definitively a good idea 

:-) Perhaps I can make one (though I am terrible at drawing pictures), because I 
want to have it here at the office as well, for a common understanding of 
Jackrabbit indexing.

>> ... if it is ever possible that a parent of a node can be found in the 
>> parent index.

>that's not possible. the parent index contains the nodes under /jcr:system 
>(including the /jcr:system node). the opposite is possible, though just for 
>one node, the mentioned jcr:system node. this one will have a UUIDDocId, which 
>references the root node of the workspace.

That is good news, and it makes things a little easier. I am thinking about a 
two-step check, where the reference to the entire MultiIndexReader is checked 
first:

IF the cached reference to the entire MultiIndexReader instance still matches: 
return the cached result.
ELSE IF the index reader segment instance that held the parent docNumber is 
still present: recompute docNumber with respect to the new offsets in 
MultiIndexReader and return the (almost) cached result.
ELSE: recompute docNumber by a search in MultiIndexReader (the uncached case).
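
To make the check above concrete, here is a minimal sketch in Java. It is not 
Jackrabbit code: Segment, MultiReader, Hit, ParentSearch and CachedParentDocId 
are hypothetical stand-ins invented for illustration, and the sketch assumes 
that a segment's offset is simply the sum of maxDoc of the segments before it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for one index reader segment; only its identity and size matter. */
final class Segment {
    final int maxDoc;
    Segment(int maxDoc) { this.maxDoc = maxDoc; }
}

/** Hypothetical stand-in for a MultiIndexReader: an ordered list of segments. */
final class MultiReader {
    final List<Segment> segments;
    MultiReader(List<Segment> segments) { this.segments = new ArrayList<>(segments); }

    /** Doc-number offset of a segment in this reader, or -1 if it is not part of it. */
    int offsetOf(Segment wanted) {
        int offset = 0;
        for (Segment s : segments) {
            if (s == wanted) return offset;
            offset += s.maxDoc;
        }
        return -1;
    }
}

/** Result of the expensive lookup: the segment holding the parent and its doc number inside it. */
final class Hit {
    final Segment segment;
    final int docInSegment;
    Hit(Segment segment, int docInSegment) {
        this.segment = segment;
        this.docInSegment = docInSegment;
    }
}

/** The uncached search by uuid; assumed to be the expensive part. */
interface ParentSearch {
    Hit search(MultiReader reader);
}

final class CachedParentDocId {
    private MultiReader cachedReader;
    private Hit cachedHit;
    private int cachedDocNumber = -1;
    int searches; // counts uncached searches, for demonstration only

    int getDocumentNumber(MultiReader reader, ParentSearch search) {
        // IF: same MultiIndexReader instance, so the cached doc number is still valid.
        if (reader == cachedReader) return cachedDocNumber;

        // ELSE IF: the segment is still present, so only the offset has to be recomputed.
        int offset = (cachedHit == null) ? -1 : reader.offsetOf(cachedHit.segment);
        if (offset < 0) {
            // ELSE: the uncached case, search again in the new reader.
            cachedHit = search.search(reader);
            searches++;
            offset = reader.offsetOf(cachedHit.segment);
        }
        cachedReader = reader;
        cachedDocNumber = offset + cachedHit.docInSegment;
        return cachedDocNumber;
    }
}
```

In this sketch the middle branch costs one pass over the segment list instead 
of a search, which is why the result is only "almost" cached.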

I will try to implement it during the weekend, because I am really occupied 
for the next few days. I will share my findings and tests (and any performance 
issues), hopefully on Sunday.

> DocId.UUIDDocId should not have a string attr uuid, but two long's lsb and 
> msb 
> -------------------------------------------------------------------------------
>
>                 Key: JCR-1214
>                 URL: https://issues.apache.org/jira/browse/JCR-1214
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: query
>    Affects Versions: 1.3.3
>            Reporter: Ard Schrijvers
>             Fix For: 1.4
>
>
> Once JCR-1213 is solved, lots of DocId.UUIDDocId instances can be cached 
> without being cleaned up after every gc(). The number of cached UUIDDocId 
> instances can grow very large, depending on the size of the repository. 
> Therefore, instead of storing the private String uuid; we can make it more 
> memory efficient by storing 2 longs, the lsb and msb of the uuid. For 
> 1.000.000 parent UUIDDocId instances, this might make a difference of about 
> 100 MB of memory. 
> I even tested removing the uuid string entirely, using neither msb nor lsb, 
> because, when everything works properly (with references to index reader 
> segments, see JCR-1213), the uuid is never needed again: in 
> UUIDDocId getDocumentNumber(IndexReader reader) throws IOException {
> we could set uuid = null just before the return. It works perfectly well, 
> because when an index reader is recreated, the CachingIndexReader is 
> recreated too, and hence DocId[] parents is recreated. 
> So, IMO, we might be able to remove the uuid entirely once the 
> docNumber is found in DocId.UUIDDocId (obviously after JCR-1213).
> WDYT?
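
To illustrate the proposal quoted above, a minimal sketch of keeping the uuid 
as two longs. CompactUuidDocId is a hypothetical class name, not the actual 
DocId.UUIDDocId, and the sketch assumes the uuid string is in the standard 
form accepted by java.util.UUID.fromString.

```java
import java.util.UUID;

/** Hypothetical compact variant of a uuid-based doc id; not Jackrabbit's actual class. */
final class CompactUuidDocId {
    private final long msb; // most significant 64 bits of the uuid
    private final long lsb; // least significant 64 bits of the uuid

    CompactUuidDocId(String uuid) {
        UUID parsed = UUID.fromString(uuid);
        this.msb = parsed.getMostSignificantBits();
        this.lsb = parsed.getLeastSignificantBits();
    }

    long getMsb() { return msb; }

    long getLsb() { return lsb; }

    /** Rebuilds the textual form only when it is actually needed. */
    String uuidString() {
        return new UUID(msb, lsb).toString();
    }
}
```

The two fields hold 16 bytes of payload, whereas the string form keeps 36 
characters plus the String and char[] object overhead for every cached 
instance.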

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
