Ewen Cheslack-Postava commented on KAFKA-6551:

Seems reasonable. This should only be an issue if producing to the topic is 
failing and we build up a large backlog, but it's a very good point that the 
backlog should be bounded, at least roughly, and that we should pause 
poll()ing until it drains. It's a bit hard to say what the right metric is, 
since the map holds onto the entire record. Maybe # of records will work in 
practice, just because you can set it to a reasonable default and never think 
about it again while still not hitting any OOMs. But any large messages could 
make that assumption fail.
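
As a rough illustration of the record-count bound discussed above, here is a 
minimal sketch in Java. It is not the actual WorkerSourceTask code: the class 
BoundedOutstanding and its methods tryAdd/complete are hypothetical names, and 
the sketch assumes the task loop calls tryAdd() before handing a record to the 
producer and the producer callback calls complete() once the record is 
acknowledged. While tryAdd() returns false, the loop would simply skip 
poll()ing the connector.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not part of Kafka Connect): tracks in-flight records
 * and caps how many may be outstanding at once.
 */
class BoundedOutstanding<R> {
    // One permit per record that may be outstanding.
    private final Semaphore permits;
    // Identity-based tracking, guarded by its own monitor.
    private final Map<R, Boolean> outstanding = new IdentityHashMap<>();

    BoundedOutstanding(int maxRecords) {
        this.permits = new Semaphore(maxRecords);
    }

    /** Waits up to the timeout for room; returns false if still full. */
    boolean tryAdd(R record, long timeout, TimeUnit unit) throws InterruptedException {
        if (!permits.tryAcquire(timeout, unit)) {
            return false; // caller should stop polling the connector for now
        }
        synchronized (outstanding) {
            outstanding.put(record, Boolean.TRUE);
        }
        return true;
    }

    /** Intended to be called once the record has been acknowledged. */
    void complete(R record) {
        boolean removed;
        synchronized (outstanding) {
            removed = outstanding.remove(record) != null;
        }
        if (removed) {
            permits.release(); // frees one slot for the next record
        }
    }

    int size() {
        synchronized (outstanding) {
            return outstanding.size();
        }
    }
}
```

Counting records keeps the limit easy to configure but, as noted, does not 
protect against a few very large messages. A size-based variant could acquire 
a number of permits proportional to each record's estimated byte size 
instead, at the cost of needing a size estimate per record.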

> Unbounded queues in WorkerSourceTask cause OutOfMemoryError
> -----------------------------------------------------------
>                 Key: KAFKA-6551
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6551
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>            Reporter: Gunnar Morling
>            Priority: Major
> A Debezium user reported an {{OutOfMemoryError}} to us, with over 50,000 
> messages in the {{WorkerSourceTask#outstandingMessages}} map.
> This map is unbounded and I can't see any way of "rate limiting" which would 
> control how many records are added to it. Growth can only indirectly be 
> limited by reducing the offset flush interval, but as connectors can return 
> large amounts of messages in single {{poll()}} calls that's not sufficient in 
> all cases. Note that the user reported this issue while snapshotting a 
> database, i.e. when a high number of records arrived in a very short period 
> of time.
> To solve the problem I'd suggest making this map backpressure-aware and thus 
> preventing its indefinite growth, so that no further records will be polled 
> from the connector until messages have been taken out of the map again.

This message was sent by Atlassian JIRA
