I think using UpdateRecord would also cause an OOME here. The issue is that 
we have one really huge JSON document. That is 1 record, not 10 million records. 
Fortunately, we do have a way to deal with this. In the JsonTreeReader, 
you can set the "Starting Field Strategy" property to "Nested Field" and set 
the "Starting Field Name" property to "records". I think this should give you 
what you want: the reader uses a streaming JSON parser to skip the fields before 
"records", so it doesn't load the entire thing into memory.
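
To illustrate the idea only, here is a rough Python sketch of what "skip ahead 
to a field and read its array one element at a time" can look like. This is 
not NiFi's code. The function name iter_records and the chunk_size parameter 
are made up for this example, and it assumes the array elements are JSON 
objects and that the text "records" does not appear earlier inside a string value.

```python
import io
import json


def iter_records(stream, field="records", chunk_size=65536):
    """Yield the elements of the array stored under `field`, one at a time.

    Assumption: the field is located with a plain text search, which a real
    streaming parser would replace with proper token tracking.
    """
    decoder = json.JSONDecoder()
    marker = '"%s"' % field
    buf = ""

    # Phase 1: throw away input until we are just past the '[' that
    # follows the field name.
    while True:
        idx = buf.find(marker)
        if idx != -1:
            start = buf.find("[", idx + len(marker))
            if start != -1:
                buf = buf[start + 1:]
                break
            buf = buf[idx:]  # marker found, '[' not read yet
        else:
            # Keep a short tail in case the marker spans two chunks.
            buf = buf[-len(marker):]
        chunk = stream.read(chunk_size)
        if not chunk:
            return  # field never found
        buf += chunk

    # Phase 2: decode one array element at a time, reading more input
    # only when the buffer does not yet hold a complete element.
    while True:
        buf = buf.lstrip(" \t\r\n,")
        if buf.startswith("]"):
            return  # end of the array
        try:
            record, end = decoder.raw_decode(buf)
        except json.JSONDecodeError:
            chunk = stream.read(chunk_size)
            if not chunk:
                raise  # input ended in the middle of an element
            buf += chunk
            continue
        yield record
        buf = buf[end:]
```

Called as iter_records(open(path, encoding="utf-8")), this keeps only one 
chunk plus one record in memory at a time, which is the same reason the 
"Nested Field" strategy avoids holding the whole document on the heap.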

Thanks
-Mark


> On Feb 7, 2025, at 5:55 AM, Pierre Villard <pierre.villard...@gmail.com> 
> wrote:
> 
> You could use UpdateRecord with record path and have a dynamic
> property: / => /records
> 
> On Fri, Feb 7, 2025 at 11:51, <e-soci...@gmx.fr> wrote:
>> 
>> 
>> Hello all,
>> 
>> NiFi receives a flowfile containing a big JSON document with around 10,000,000 records:
>> 
>> {
>>  "state" : "SUCCEEDED",
>>  "progress" : 100,
>>  "result" : {
>>    "records" : [ {
>>      "timestamp" : "2025-02-07T08:59:59.999000000+01:00",
>>      "data": "fooA",
>>    },{
>>      "timestamp" : "2025-02-07T08:59:33.999000000+01:00",
>>      "data": "fooB",
>>    },.................
>> ............
>> .......
>> ]
>> }
>> 
>> How can we extract only the records and avoid an "out of memory" error when we 
>> use EvaluateJsonPath?
>> 
>> Thanks for your help
>> 
>> Minh
