[
https://issues.apache.org/jira/browse/OAK-11158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nuno Santos resolved OAK-11158.
-------------------------------
Fix Version/s: 1.72.0
Resolution: Done
> indexing-job/downloader - Move the conversion of Mongo responses to
> NodeDocument from the download to the transform threads
> ---------------------------------------------------------------------------------------------------------------------------
>
> Key: OAK-11158
> URL: https://issues.apache.org/jira/browse/OAK-11158
> Project: Jackrabbit Oak
> Issue Type: Task
> Components: indexing
> Reporter: Nuno Santos
> Priority: Major
> Fix For: 1.72.0
>
>
> Currently, the download thread iterates over the responses received from
> Mongo and converts each one to a NodeDocument instance. This is a fairly
> expensive operation, which can account for more than 50% of the time of the
> download threads. While the download thread is processing the answer, it is
> blocked from requesting more data from Mongo, which is often the bottleneck.
> We can instead convert the Mongo documents to RawBsonDocument instances, each
> of which is just a copy of the binary buffer representing a Mongo document.
> This is a very fast operation, as it only requires copying the binary buffer.
> We can then pass these RawBsonDocuments to the transform threads, which will
> then convert them to NodeDocument.
> This moves the heavy work of parsing the answer away from the download
> threads, which should significantly improve the download speed, as the
> download threads will take less time to process each Mongo response and will
> send the next request sooner. To deal with the extra load on the transform
> threads, we can increase their number, which is currently set to 2.
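
The hand-off described above could be sketched roughly as follows. This is a minimal illustration using only the JDK, not the actual Oak code: a plain byte[] copy stands in for RawBsonDocument, and the class name RawDocPipelineSketch, the parse() method, and the queue size are all hypothetical. The download side only copies bytes into a queue, and the transform threads do the expensive parsing.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch; names and sizes are assumptions, not Oak's implementation.
class RawDocPipelineSketch {
    // Sentinel (compared by identity) telling a transform thread to stop.
    private static final byte[] POISON = new byte[0];

    // Stand-in for the expensive raw-bytes-to-NodeDocument conversion.
    static String parse(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8).toUpperCase();
    }

    static List<String> run(List<byte[]> response, int transformThreads)
            throws InterruptedException {
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(1024);
        ConcurrentLinkedQueue<String> parsed = new ConcurrentLinkedQueue<>();

        // Transform threads: take raw buffers off the queue and parse them.
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < transformThreads; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        byte[] raw = queue.take();
                        if (raw == POISON) {
                            break;
                        }
                        parsed.add(parse(raw));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            workers.add(t);
        }

        // Download side (here, the calling thread): only copy the buffer,
        // so it is free to issue the next request as soon as possible.
        for (byte[] doc : response) {
            queue.put(doc.clone());
        }
        // One sentinel per transform thread, then wait for them to finish.
        for (int i = 0; i < transformThreads; i++) {
            queue.put(POISON);
        }
        for (Thread t : workers) {
            t.join();
        }
        return new ArrayList<>(parsed);
    }
}
```

With more than one transform thread the output order is not guaranteed to match the input order, so a real pipeline that needs ordering would have to restore it downstream.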
--
This message was sent by Atlassian Jira
(v8.20.10#820010)