[ 
https://issues.apache.org/jira/browse/CAMEL-23120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Claus Ibsen updated CAMEL-23120:
--------------------------------
    Component/s: camel-docling

> camel-docling - Implement batchSize sub-batch partitioning in batch processing
> ------------------------------------------------------------------------------
>
>                 Key: CAMEL-23120
>                 URL: https://issues.apache.org/jira/browse/CAMEL-23120
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-docling
>            Reporter: Andrea Cosentino
>            Assignee: Andrea Cosentino
>            Priority: Major
>             Fix For: 4.18.1, 4.19.0
>
>
> The batchSize configuration parameter (default 10, defined in 
> DoclingConfiguration) is declared with @UriParam and read from exchange 
> headers in both processBatchConversion() and processBatchStructuredData(), 
> but its value is never applied in the batch processing logic.
>
> Both convertDocumentsBatch() and convertStructuredDataBatch() accept 
> batchSize as a method parameter, yet they submit all documents to the thread 
> pool executor at once, regardless of the configured value. This makes 
> batchSize a no-op, which is misleading for users who configure it expecting 
> it to control processing granularity.
>
> When processing large document sets (e.g., thousands of files), every 
> document is submitted as a CompletableFuture simultaneously, which can cause 
> excessive memory consumption and provides no back-pressure.
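
For illustration only, here is a minimal sketch of what sub-batch partitioning could look like. It is not the camel-docling implementation or the actual fix: the class SubBatchSketch, the method processInSubBatches() and its parameters are hypothetical names chosen for this example. The idea is to submit at most batchSize tasks to the executor, wait for that sub-batch to finish, and only then submit the next one, so the number of in-flight CompletableFuture instances stays bounded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.function.Function;

// Hypothetical helper for illustration; not part of camel-docling.
final class SubBatchSketch {

    private SubBatchSketch() {
    }

    /**
     * Processes items in consecutive sub-batches of at most batchSize elements.
     * A sub-batch is submitted only after the previous one has completed, so
     * no more than batchSize futures are in flight at any time.
     * Results are returned in the same order as the input items.
     */
    static <T, R> List<R> processInSubBatches(
            List<T> items, int batchSize, ExecutorService executor, Function<T, R> task) {

        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }

        List<R> results = new ArrayList<>(items.size());

        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());

            // Submit only the current sub-batch to the executor.
            List<CompletableFuture<R>> futures = new ArrayList<>(end - start);
            for (T item : items.subList(start, end)) {
                futures.add(CompletableFuture.supplyAsync(() -> task.apply(item), executor));
            }

            // Block until the whole sub-batch is done before moving on.
            // join() rethrows a task failure as CompletionException.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();

            for (CompletableFuture<R> future : futures) {
                results.add(future.join());
            }
        }
        return results;
    }
}
```

With a list of 7 documents and a batchSize of 3, this sketch would submit sub-batches of 3, 3 and 1. Waiting on each sub-batch is the simplest way to get back-pressure; a real implementation might prefer a semaphore or a bounded queue to keep the thread pool busier, and would need to decide how a failure in one sub-batch affects the remaining ones.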



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
