OK, so the decompression should be CPU-intensive but not heap/memory-intensive.
EvaluateJsonPath will potentially consume large amounts of heap as well, 
depending on how it’s configured.
The ExecuteGroovyScript sounds like it would use very little.
ReplaceText may well consume huge amounts of heap, depending on how it’s 
configured.
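For reference, these are the ReplaceText properties that most affect heap usage (a sketch of the standard processor's settings; verify the exact names and defaults against your NiFi version):

```
# ReplaceText properties that drive heap usage
Evaluation Mode: Entire text     # buffers the whole FlowFile content in memory
# vs.
Evaluation Mode: Line-by-Line    # processes one line at a time; far less heap
Maximum Buffer Size: 1 MB        # upper bound on content buffered per FlowFile
```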

Can you share how EvaluateJsonPath and ReplaceText are configured?

The idea that 16 GB of RAM is max recommended for a JVM was true a while ago 
but with modern JVM’s you can go much higher. That said, given the flow 
described, 4 GB should be more than sufficient if properly configured.
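If you do decide to raise the heap, it is set in NiFi's conf/bootstrap.conf (a sketch; the java.arg indices can differ between installs, so match whichever lines carry -Xms/-Xmx in your file):

```
# conf/bootstrap.conf - JVM heap settings
java.arg.2=-Xms4g    # initial heap size
java.arg.3=-Xmx4g    # maximum heap size
```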

Thanks
-Mark


On Nov 6, 2024, at 9:51 AM, e-soci...@gmx.fr wrote:

Thanks for reply Mark,

The groovy script is very simple :

        // 'flowFile' and 'outputStream' are bindings provided by ExecuteGroovyScript
        hexContent = flowFile.getAttribute('hexContent') // hex string held in an attribute (on-heap)
        hexContent = hexContent.decodeHex()              // Groovy extension: hex string -> byte[]
        outputStream.write(hexContent)                   // write the decoded bytes as FlowFile content
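For illustration, the decode step above in Python (a minimal sketch, not part of the flow; the attribute value shown is a made-up example):

```python
import binascii

# A hex string like the one stored in the 'hexContent' attribute (example value)
hex_content = "48656c6c6f2c204e694669"

# Equivalent of Groovy's String.decodeHex(): hex text -> raw bytes
raw = binascii.unhexlify(hex_content)

print(raw)  # b'Hello, NiFi'
```

The point for heap usage: the whole hex string lives in an attribute (on the JVM heap), and decoding allocates a second full-size byte array, so large payloads stored this way are costly.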

The question is how to process flowfiles as quickly as possible.
If I upgrade to 8 CPUs per node, is it possible to process fewer flowfiles 
concurrently but more flowfiles overall?

The main nifi dataflow is :

  *   Uncompress incoming flowfiles (CPU/heap-consuming, I suppose)
  *   ReplaceText (heap-consuming)
  *   EvaluateJsonPath (heap-consuming)
  *   ExecuteGroovyScript (heap-consuming)

I read that 16GB of RAM is the maximum recommended for a JVM and that adding 
more isn’t beneficial.
Is that true, or can I increase it to 32GB?

Regards

Minh
Sent: Wednesday, November 6, 2024 at 15:24
From: "Mark Payne" <marka...@hotmail.com>
To: "users@nifi.apache.org" <users@nifi.apache.org>
Subject: Re: Caused by: java.lang.OutOfMemoryError: Java heap space
Hi Minh,

It is possible that the heap is being exhausted by EvaluateJsonPath if you are 
using it to add large JSON chunks as attributes. For example, if you’re 
creating an attribute from `$.` to put the entire JSON contents into 
attributes. Generally, attributes should be kept pretty small.
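As an illustration (the built-in Destination property is from the standard EvaluateJsonPath processor; the dynamic-property names below are hypothetical):

```
# EvaluateJsonPath configured to write results to attributes
Destination: flowfile-attribute

# Risky: copies the entire JSON document into an attribute (heap pressure)
payload: $

# Better: extract only the small fields you actually need
record.id: $.id
record.type: $.type
```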

Otherwise, based on the flow described, the issue is almost certainly within 
ExecuteGroovyScript. There's not much guidance we can provide there, since it's 
running your own script; you'd need to work out what in the script is using up 
all of the heap.

Thanks
-Mark


On Nov 6, 2024, at 4:26 AM, e-soci...@gmx.fr wrote:

Hello all,

We have a cluster with 10 nodes (4 CPUs / 16 GB each) - NiFi 1.25 - JDK 11.0.19

We use this cluster to send data to a GCP bucket. The data is sent by other 
clusters, so we use Site-to-Site (S2S) between them.

I can't determine where the issue is. The message could be raised by 
EvaluateJsonPath, ExecuteGroovyScript, or UpdateAttribute.
We have around 100,000 flowfiles (160 GB of data).
We configured more than one concurrent task for each processor to run faster, 
but we still keep getting this error.

