[ 
https://issues.apache.org/jira/browse/OAK-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16040674#comment-16040674
 ] 

Thomas Mueller commented on OAK-2808:
-------------------------------------

I had a test failure today with the 1.7.1 release candidate, and can reproduce 
it locally (most of the time) on trunk:

{noformat}
[ERROR]   ActiveDeletedBlobCollectorTest.multiThreadedCommits:230
Expected: iterable over ["Thread0Blob0-1", "Thread0Blob0-2", ..., "Thread3Blob499-1", "Thread3Blob499-2"] in any order
     but: No item matches: "Thread2Blob497-1", "Thread2Blob497-2", "Thread2Blob498-1", "Thread2Blob498-2", "Thread2Blob499-1", "Thread2Blob499-2"
          in ["Thread1Blob0-1", "Thread1Blob0-2", ..., "Thread2Blob495-2", "Thread2Blob496-1", "Thread2Blob496-2"]
{noformat}

If I change the test slightly, then the error message is much shorter:

{noformat}
        // Expected chunk ids that the blob store did not record as deleted
        HashSet<String> list = new HashSet<>(deletedChunks);
        list.removeAll(blobStore.deletedChunkIds);
        assertTrue(list.toString(), list.isEmpty());

        assertThat(blobStore.deletedChunkIds,
                containsInAnyOrder(deletedChunks.toArray()));

java.lang.AssertionError: [Thread0Blob499-1, Thread0Blob499-2, Thread0Blob498-2, Thread0Blob498-1]
java.lang.AssertionError: [Thread3Blob498-2, Thread3Blob498-1, Thread3Blob499-1, Thread3Blob499-2, Thread3Blob497-1, Thread3Blob497-2]
java.lang.AssertionError: [Thread3Blob498-2, Thread3Blob498-1, Thread3Blob499-1, Thread3Blob499-2, Thread3Blob496-1, Thread3Blob496-2, Thread3Blob497-1, Thread3Blob497-2]
{noformat}
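
For illustration, the set-difference check above can be written as a standalone program. This is only a minimal sketch, not code from the Oak test: the class name MissingChunks, the helper missing(), and the sample ids are hypothetical, and a TreeSet is used instead of a HashSet only so that the printed order is stable.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper class, not part of Oak.
class MissingChunks {

    // Returns the expected ids that are absent from the actual ids,
    // sorted so that the failure message is stable between runs.
    static Set<String> missing(Collection<String> expected,
                               Collection<String> actual) {
        Set<String> diff = new TreeSet<>(expected);
        diff.removeAll(actual);
        return diff;
    }

    public static void main(String[] args) {
        // Made-up sample data in the same naming scheme as the test output.
        List<String> expected = Arrays.asList(
                "Thread0Blob498-1", "Thread0Blob498-2",
                "Thread0Blob499-1", "Thread0Blob499-2");
        List<String> deleted = Arrays.asList(
                "Thread0Blob498-1", "Thread0Blob498-2");

        // Prints [Thread0Blob499-1, Thread0Blob499-2]
        System.out.println(missing(expected, deleted));
    }
}
```

Printing only the difference keeps the message short, whereas containsInAnyOrder reports both complete collections.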


> Active deletion of 'deleted' Lucene index files from DataStore without 
> relying on full scale Blob GC
> ----------------------------------------------------------------------------------------------------
>
>                 Key: OAK-2808
>                 URL: https://issues.apache.org/jira/browse/OAK-2808
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene
>            Reporter: Chetan Mehrotra
>            Assignee: Vikas Saurabh
>              Labels: datastore, performance
>             Fix For: 1.8, 1.7.1
>
>         Attachments: copyonread-stats.png, OAK-2808-1.patch
>
>
> With the storing of Lucene index files within the DataStore, our usage
> pattern of the DataStore has changed between JR2 and Oak.
> With JR2, writes were mostly application-driven, i.e. if an application
> stores a PDF or image file, that file is stored in the DataStore. JR2 by
> default would not write anything to the DataStore. Further, in deployments
> where a large amount of binary content is present, systems tend to
> share the DataStore to avoid duplicating storage. In such cases,
> running Blob GC is a non-trivial task, as it involves a manual step and
> coordination across multiple deployments. Because of this, systems tend to
> run GC less frequently.
> Now with Oak, apart from the application, the Oak system itself *actively*
> uses the DataStore to store the index files for Lucene, and there the
> churn might be much higher, i.e. index files are created and deleted
> far more frequently. This would accelerate the rate of garbage
> generation and thus put a lot more pressure on the DataStore storage
> requirements.
> Discussion thread: http://markmail.org/thread/iybd3eq2bh372zrl



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
