[jira] [Commented] (OAK-4430) DataStoreBlobStore#getAllChunkIds fetches DataRecord when not needed

Amit Jain (JIRA) Tue, 07 Jun 2016 00:50:51 -0700

    [ 
https://issues.apache.org/jira/browse/OAK-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15318072#comment-15318072
 ]


Amit Jain commented on OAK-4430:
--------------------------------

Just for the record: I had an offline chat with [~chetanm] yesterday, and he 
suggested that, since S3 already returns the object size when doing the listing, 
it is better to create a new overloaded method which returns the DataRecord. This 
will save the additional call to S3 and also avoid leaking the abstraction 
related to length encoding in the blob ids.
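
To illustrate the suggestion, here is a rough sketch, not the actual patch. All names in it (ListedObject, SketchRecord, ListingBackend, getAllRecords) are hypothetical stand-ins and not the Jackrabbit or Oak API. It only assumes what is said above: the listing already carries the size of each object, so records can be built straight from the listing without a per-object lookup.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical listing entry: what an S3-style listing already returns per object. */
final class ListedObject {
    final String id;
    final long length;
    final long lastModified;

    ListedObject(String id, long length, long lastModified) {
        this.id = id;
        this.length = length;
        this.lastModified = lastModified;
    }
}

/** Hypothetical stand-in for a DataRecord (not the Jackrabbit interface). */
final class SketchRecord {
    final String id;
    final long length;
    final long lastModified;

    SketchRecord(String id, long length, long lastModified) {
        this.id = id;
        this.length = length;
        this.lastModified = lastModified;
    }
}

/** Hypothetical backend that counts the expensive per-object lookups. */
final class ListingBackend {
    private final List<ListedObject> objects;
    int metadataCalls = 0;

    ListingBackend(List<ListedObject> objects) {
        this.objects = objects;
    }

    /** One listing call; every entry already carries size and last modified time. */
    List<ListedObject> list() {
        return objects;
    }

    /** Per-object lookup: the additional call that should be avoided. */
    SketchRecord fetchRecord(String id) {
        metadataCalls++;
        for (ListedObject o : objects) {
            if (o.id.equals(id)) {
                return new SketchRecord(o.id, o.length, o.lastModified);
            }
        }
        return null;
    }
}

final class RecordListingSketch {
    /** Id-only listing: a caller needing the length has to call fetchRecord per id. */
    static List<String> getAllIdentifiers(ListingBackend backend) {
        List<String> ids = new ArrayList<>();
        for (ListedObject o : backend.list()) {
            ids.add(o.id);
        }
        return ids;
    }

    /** Overload idea: build the records directly from the listing, no extra calls. */
    static List<SketchRecord> getAllRecords(ListingBackend backend) {
        List<SketchRecord> records = new ArrayList<>();
        for (ListedObject o : backend.list()) {
            records.add(new SketchRecord(o.id, o.length, o.lastModified));
        }
        return records;
    }
}
```

With this shape the caller gets the length from the record itself and does not need to know how the length is encoded in the blob id.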

> DataStoreBlobStore#getAllChunkIds fetches DataRecord when not needed
> --------------------------------------------------------------------
>
>                 Key: OAK-4430
>                 URL: https://issues.apache.org/jira/browse/OAK-4430
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: blob
>            Reporter: Amit Jain
>            Assignee: Amit Jain
>              Labels: candidate_oak_1_2, candidate_oak_1_4
>             Fix For: 1.6
>
>
> DataStoreBlobStore#getAllChunkIds loads the DataRecord to check that the 
> last modified time satisfies the given {{maxLastModifiedTime}}. 
> When {{maxLastModifiedTime}} is 0, it effectively means that any last 
> modified time check is ignored (which is the only usage currently, from 
> MarkSweepGarbageCollector). In that case fetching the DataRecords should be 
> skipped, as it can be very expensive, e.g. on calls to S3 with millions of blobs.
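
A minimal sketch of the short-circuit described in the issue, under stated assumptions: the class ChunkIdFilterSketch, its in-memory map and the fetch counter are hypothetical and only stand in for the real data store; the `<=` comparison is likewise an assumption about how the criterion is applied. The point shown is only that a {{maxLastModifiedTime}} of 0 should skip the record fetch altogether.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical store; the map stands in for the remote backend. */
final class ChunkIdFilterSketch {
    private final Map<String, Long> lastModifiedById;
    int recordFetches = 0;

    ChunkIdFilterSketch(Map<String, Long> lastModifiedById) {
        this.lastModifiedById = lastModifiedById;
    }

    /** Stands in for loading a DataRecord, the expensive call. */
    private long fetchLastModified(String id) {
        recordFetches++;
        return lastModifiedById.get(id);
    }

    /**
     * Returns all ids when maxLastModifiedTime is 0 (no check requested),
     * otherwise only ids whose last modified time is not after the limit.
     */
    List<String> getAllChunkIds(long maxLastModifiedTime) {
        List<String> result = new ArrayList<>();
        for (String id : lastModifiedById.keySet()) {
            // Short-circuit: with 0 the record is never fetched.
            if (maxLastModifiedTime == 0
                    || fetchLastModified(id) <= maxLastModifiedTime) {
                result.add(id);
            }
        }
        return result;
    }
}
```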



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
