[jira] [Updated] (OAK-11158) indexing-job/downloader - Move the conversion of Mongo responses to NodeDocument from the download to the transform threads

Nuno Santos (Jira) Wed, 23 Oct 2024 04:50:37 -0700


     [ 
https://issues.apache.org/jira/browse/OAK-11158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Nuno Santos updated OAK-11158:
------------------------------
    Issue Type: Task  (was: Bug)

> indexing-job/downloader - Move the conversion of Mongo responses to 
> NodeDocument from the download to the transform threads
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: OAK-11158
>                 URL: https://issues.apache.org/jira/browse/OAK-11158
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: indexing
>            Reporter: Nuno Santos
>            Priority: Major
>
> Currently, the download thread iterates over the response received from 
> Mongo and converts each entry to a NodeDocument instance. This is a fairly 
> expensive operation that can account for more than 50% of the time of the 
> download threads. While a download thread is processing a response, it is 
> blocked from requesting more data from Mongo, which is often the bottleneck.
> We can instead convert the Mongo documents to RawBsonDocument instances, 
> each of which is just a copy of the binary buffer representing a Mongo 
> document. This is a very fast operation, as it only requires copying the 
> binary buffer. We can then pass these RawBsonDocuments to the transform 
> threads, which will convert them to NodeDocument. 
> This moves the heavy work of parsing the response away from the download 
> threads, which should significantly improve the download speed, as the 
> download threads will take less time to process each Mongo response and will 
> send the next request sooner. To deal with the extra load on the transform 
> threads, we can increase their number, which is currently set to 2.
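
The hand-off described above could be sketched roughly as follows. This is a minimal illustration using only the JDK, not Oak's actual code: the class name DownloadTransformSketch, the methods run and parse, and the queue capacity are all hypothetical. A plain byte[] stands in for RawBsonDocument, and an uppercased String stands in for NodeDocument. The download side only copies each buffer and enqueues it, while a configurable number of transform threads does the expensive parsing.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: names and types are illustrative, not taken from Oak.
class DownloadTransformSketch {
    // Sentinel (compared by identity) telling a transform thread to stop.
    private static final byte[] POISON = new byte[0];

    // Stand-in for the expensive raw-buffer-to-NodeDocument conversion.
    static String parse(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8).toUpperCase(Locale.ROOT);
    }

    // Copies each response cheaply on the calling ("download") thread and
    // lets transformThreads workers do the parsing. Result order is not
    // guaranteed, because the workers run concurrently.
    static List<String> run(List<byte[]> responses, int transformThreads) {
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(16);
        List<String> parsed = Collections.synchronizedList(new ArrayList<>());
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < transformThreads; i++) {
            Thread t = new Thread(() -> {
                try {
                    for (byte[] raw = queue.take(); raw != POISON; raw = queue.take()) {
                        parsed.add(parse(raw)); // heavy work happens here
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            workers.add(t);
        }
        try {
            // Download side: only a buffer copy, then on to the next response.
            for (byte[] response : responses) {
                queue.put(Arrays.copyOf(response, response.length));
            }
            // One sentinel per worker so that every worker terminates.
            for (int i = 0; i < transformThreads; i++) {
                queue.put(POISON);
            }
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while downloading", e);
        }
        return parsed;
    }

    public static void main(String[] args) {
        List<byte[]> responses = Arrays.asList(
                "doc-1".getBytes(StandardCharsets.UTF_8),
                "doc-2".getBytes(StandardCharsets.UTF_8));
        System.out.println(run(responses, 2));
    }
}
```

The bounded queue applies back-pressure: if the transform threads fall behind, the download side blocks on put, which is the situation where raising the number of transform threads would help.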



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
