Re: How work with Big JSON

e-sociaux Mon, 10 Feb 2025 04:43:19 -0800

Thanks for the reply.

I will try it.

Sent: Friday, 7 February 2025 at 15:43
From: "Mark Payne" <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: How work with Big JSON

I think using UpdateRecord would also cause an OOME here. The issue is that we have one really huge JSON document. That is 1 record, not 10 million records. Fortunately, we do have a way to deal with this. In the JsonTreeReader, you can set the "Starting Field Strategy" property to "Nested Field" and set the "Starting Field Name" property to "records". I think this should give you what you want: it uses a streaming JSON parser to skip the fields before "records", so it doesn't load the entire document into memory.
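
To illustrate the general idea of streaming past the leading fields and emitting array elements one by one, here is a minimal Python sketch. It is not NiFi's implementation. The function name iter_nested_array and its chunk_size parameter are hypothetical, and it locates the field with a naive text search, so it assumes the quoted field name does not appear earlier in the document (for example inside a string value).

```python
import io
import json


def iter_nested_array(stream, field_name, chunk_size=65536):
    """Yield elements of the first array found after `field_name`, one at a time.

    Hypothetical helper: only a sliding buffer is held in memory, never the
    whole document. Assumes valid JSON and a text-mode stream.
    """
    decoder = json.JSONDecoder()
    marker = '"%s"' % field_name
    buf = ""

    # Phase 1: discard input until the field name and its opening '[' are seen.
    while True:
        pos = buf.find(marker)
        if pos != -1:
            start = buf.find("[", pos + len(marker))
            if start != -1:
                buf = buf[start + 1:]
                break
            buf = buf[pos:]  # keep the marker, drop everything before it
        else:
            buf = buf[-len(marker):]  # keep a tail in case the marker spans chunks
        chunk = stream.read(chunk_size)
        if not chunk:
            return  # field not found: yield nothing
        buf += chunk

    # Phase 2: decode one array element at a time from the buffer.
    eof = False
    while True:
        buf = buf.lstrip(" \t\r\n,")  # skip separators between elements
        if buf.startswith("]"):
            return  # end of the array
        try:
            obj, end = decoder.raw_decode(buf)
            # A value that reaches the end of the buffer may be cut off
            # (e.g. a number), so ask for more input before accepting it.
            complete = end < len(buf) or eof
        except json.JSONDecodeError:
            if eof:
                raise  # truncated or invalid input
            complete = False
        if not complete:
            chunk = stream.read(chunk_size)
            if chunk:
                buf += chunk
            else:
                eof = True
            continue
        yield obj
        buf = buf[end:]


if __name__ == "__main__":
    sample = io.StringIO(
        '{"state": "SUCCEEDED", "progress": 100, "result": {"records": ['
        '{"timestamp": "2025-02-07T08:59:59.999000000+01:00", "data": "fooA"},'
        '{"timestamp": "2025-02-07T08:59:33.999000000+01:00", "data": "fooB"}]}}'
    )
    for record in iter_nested_array(sample, "records", chunk_size=16):
        print(record["data"])
```

Re-decoding the buffer after each short read is wasteful for very large elements, so a real flow would rely on a proper streaming parser, as the record reader does.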

Thanks
-Mark

> On Feb 7, 2025, at 5:55 AM, Pierre Villard <[email protected]> wrote:
>
> You could use UpdateRecord with record path and have a dynamic
> property: / => /records
>
> On Fri, 7 Feb 2025 at 11:51, <[email protected]> wrote:
>>
>>
>> Hello all,
>>
>> NiFi receives a flowfile containing a big JSON document with around 10,000,000 records:
>>
>> {
>> "state" : "SUCCEEDED",
>> "progress" : 100,
>> "result" : {
>> "records" : [ {
>> "timestamp" : "2025-02-07T08:59:59.999000000+01:00",
>> "data": "fooA"
>> },{
>> "timestamp" : "2025-02-07T08:59:33.999000000+01:00",
>> "data": "fooB"
>> },.................
>> ............
>> .......
>> ]
>> }
>> }
>>
>> How can we extract only the records and avoid an "out of memory" error when using EvaluateJsonPath?
>>
>> Thanks for your help
>>
>> Minh
