Hi Vladimir,

I think this is potentially an issue but I don't think this is about PDS at all.

The description is a bit vague, I have to say. AFAIU what you see is that when 
the caches are persistent, the streamer writes data faster than the nodes 
(especially backup nodes) can process the writes.
Therefore, the nodes accumulate the writes in their queues, the queues grow, 
and then you might go OOM.

The solution of just using smaller queues when persistence is enabled (and 
therefore the queues are more likely to reach the max size) is not the best 
one, in my opinion.
If the default max queue size is too large, it should always be smaller, 
regardless of why the queues grow.

Furthermore, I have a feeling that what gives you OOM isn't the data streamer 
queue... AFAIR your data streamer queue size is something like (entrySize * 
bufferSize * perNodeParallelOperations),
which for 1 KB entries and 16 threads gives (1 KB * 512 * 16 * 8) = 64 MB, 
which is usually peanuts for a server JVM.
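
To make the arithmetic above explicit, here is a rough sketch of the same 
estimate. I'm reading 512 as the per-node buffer size and 16 * 8 as 
perNodeParallelOperations (threads times a multiplier); those readings, and 
all class and method names below, are assumptions for illustration, not Ignite 
code.

```java
// Back-of-the-envelope estimate of per-node data streamer buffering.
// Illustrative only: the class and method names are hypothetical.
public class StreamerQueueEstimate {
    /** Estimated bytes buffered per node: entrySize * bufferSize * parallelOps. */
    static long estimateBytes(long entrySizeBytes, int perNodeBufferSize, int perNodeParallelOps) {
        return entrySizeBytes * perNodeBufferSize * perNodeParallelOps;
    }

    public static void main(String[] args) {
        long entrySize = 1024;   // 1 KB entries, as in the example above
        int bufferSize = 512;    // entries per buffer (assumed meaning of "512")
        int threads = 16;        // threads, as in the example above
        int opsMultiplier = 8;   // per-thread multiplier (assumed meaning of "8")

        long bytes = estimateBytes(entrySize, bufferSize, threads * opsMultiplier);
        System.out.println(bytes / (1024 * 1024) + " MB"); // prints "64 MB"
    }
}
```

If the real entries are much larger than 1 KB the estimate scales linearly, so 
it's worth plugging in the actual entry size from the test.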

Can you check the heap dump in your tests to see what actually occupies most of 
the heap? 

Thanks,
Stan

> On 28 Oct 2022, at 11:54, Vladimir Steshin <vlads...@gmail.com> wrote:
> 
>     Hi Folks,
> 
>     I found that the data streamer may consume the whole heap, or use an 
> increased amount of heap, when loading into a persistent cache.
> This may happen when the streamer's 'allowOverwrite' == true and the cache 
> is in PRIMARY_SYNC mode.
> 
>     What I don't like here is that the case looks simple. These are not the 
> defaults, but a user might hit the issue in a trivial test while trying out 
> the streamer.
> 
>     The streamer has a related setting, 'perNodeParallelOperations()', which 
> helps. But an additional DFLT_PARALLEL_PERSISTENT_OPS_MULTIPLIER might be 
> set for PDS.
> 
>     My questions are:
> 1) Is it an issue at all? Does it need a fix? Is it minor?
> 2) Should we introduce an additional default, 
> DFLT_PARALLEL_PERSISTENT_OPS_MULTIPLIER, for PDS because it reduces heap 
> consumption?
> 3) A better solution is backpressure. But is it worth it for this case?
> 
> Ticket: https://issues.apache.org/jira/browse/IGNITE-17735
> PR: https://github.com/apache/ignite/pull/10343
