[
https://issues.apache.org/jira/browse/IGNITE-20291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aleksandr Polovtcev updated IGNITE-20291:
-----------------------------------------
Description:
I was running a stress test that simply inserted a lot of rows using the
{{RecordView#streamData}} method with the following options:
{code:java}
var options = DataStreamerOptions.builder()
        .perNodeParallelOperations(1)
        .batchSize(10_000)
        .autoFlushFrequency(100_000)
        .build();
{code}
After about 30 minutes I started getting Java OOM errors. Having inspected the
heap dump, I found that all memory had been consumed by the Tuple instances
that I had been inserting. The reason is the following:
# {{StreamerBuffer}} accumulates batches of data and then sends them to the
server. If the server is slow for any reason, batch sends are placed into a
"queue" by chaining {{flushFut}} futures one after another.
# {{StreamerSubscriber}} is the class that controls back-pressure. It uses two
variables for this purpose: {{pendingItemCount}}, which represents the number
of items requested by the Subscriber, and {{inFlightItemCount}}, which
represents the number of items being sent to the server. However,
{{StreamerSubscriber}} does not know about the "queue" present in
{{StreamerBuffer}} and increases {{inFlightItemCount}} only when a
particular flush future is being executed. This means that futures sitting in
the "queue" do not contribute to the overall {{inFlightItemCount}}, and
{{StreamerSubscriber}} keeps requesting more items even though many items
are still waiting in the {{StreamerBuffer}} "queue".
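The accounting gap described above can be illustrated with a small, deterministic model. This is only a sketch under stated assumptions, not the actual Ignite code: {{BackPressureSketch}}, {{peakBufferedItems}}, and all of its parameters are hypothetical names, time is modeled as discrete ticks, and the subscriber requests one batch per tick whenever its in-flight count allows it. The only difference between the two modes is whether batches waiting in the "queue" count as in-flight.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical model of the back-pressure accounting; not taken from the Ignite code base. */
public final class BackPressureSketch {
    /**
     * Simulates a subscriber feeding batches to a slow server and returns the peak
     * number of items held in memory (executing batch plus queued batches).
     *
     * @param countQueued true if queued batches count towards the in-flight total
     * @param batchSize items per batch
     * @param limit maximum in-flight items the subscriber is supposed to allow
     * @param ticks number of simulation steps
     * @param serverTicksPerBatch ticks the server needs to process one batch (at least 1)
     */
    public static int peakBufferedItems(
            boolean countQueued, int batchSize, int limit, int ticks, int serverTicksPerBatch) {
        Deque<Integer> queued = new ArrayDeque<>(); // batches waiting for their turn
        int executing = 0;       // items in the batch currently being sent
        int remainingTicks = 0;  // ticks until the executing batch completes
        int peak = 0;

        for (int t = 0; t < ticks; t++) {
            // Server side: finish the current batch, then start the next queued one.
            if (executing > 0 && --remainingTicks == 0) {
                executing = 0;
            }
            if (executing == 0 && !queued.isEmpty()) {
                executing = queued.poll();
                remainingTicks = serverTicksPerBatch;
            }

            int queuedItems = queued.stream().mapToInt(Integer::intValue).sum();

            // The described problem corresponds to countQueued == false:
            // only the executing batch is visible to the back-pressure check.
            int inFlight = countQueued ? executing + queuedItems : executing;

            // Subscriber side: request one more batch while below the limit.
            if (inFlight + batchSize <= limit) {
                queued.add(batchSize);
                queuedItems += batchSize;
            }

            peak = Math.max(peak, executing + queuedItems);
        }
        return peak;
    }
}
```

In this model, calling {{peakBufferedItems(false, 10, 20, 100, 5)}} lets the buffered item count grow with every tick, because the check never sees the queued batches, while {{peakBufferedItems(true, 10, 20, 100, 5)}} keeps it within the limit. Counting queued items as in-flight is one possible way to close the gap; whether it suits the real {{StreamerSubscriber}} is an assumption of this sketch.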
> Possible memory leak in StreamerBuffer
> --------------------------------------
>
> Key: IGNITE-20291
> URL: https://issues.apache.org/jira/browse/IGNITE-20291
> Project: Ignite
> Issue Type: Bug
> Reporter: Aleksandr Polovtcev
> Assignee: Aleksandr Polovtcev
> Priority: Major
> Labels: ignite-3