You can merge multiple Avro flow files into one with the MergeRecord processor,
using an Avro Reader and an Avro Writer.
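
For example, a MergeRecord configuration along these lines could do it (the
controller service names, the thresholds, and the correlation attribute are
only illustrative, and assume the incoming flow files carry a
db.table.fullname attribute identifying the source table):

    Record Reader              = AvroReader
    Record Writer              = AvroRecordSetWriter
    Merge Strategy             = Bin-Packing Algorithm
    Correlation Attribute Name = db.table.fullname
    Minimum Number of Records  = 1000000
    Maximum Number of Records  = 100000000
    Max Bin Age                = 10 min

Correlating on the table name attribute keeps one bin per table, so each table
ends up in a single merged Avro file.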

On Thu, Jan 4, 2024 at 22:05, <[email protected]> wrote:

> And the important thing for us is to have only one Avro file per table.
>
> So is it possible to merge several Avro files into one Avro file?
>
> Regards
>
>
> *Sent:* Thursday, January 4, 2024 at 19:01
> *From:* [email protected]
> *To:* [email protected]
> *Cc:* [email protected]
> *Subject:* Re: Hardware requirement for NIFI instance
>
> Hello all,
>
> Thanks a lot for the reply.
>
> So here are some more details:
>
> All the ExecuteSQL properties are left at their defaults, except "Set Auto
> Commit: false".
>
> The SQL command could not be simpler: "*select * from ${db.table.fullname}*"
>
> The NiFi versions used are 1.16.3 and 1.23.2.
>
> I have also tested the same SQL command on another NiFi instance (8 cores /
> 16 GB RAM) and it works.
> The result is an Avro file of 1.6 GB.
>
> The details of the output FlowFile:
>
> executesql.query.duration: 245118
> executesql.query.executiontime: 64122
> executesql.query.fetchtime: 180996
> executesql.resultset.index: 0
> executesql.row.count: 14961077
>
> File Size: 1.62 GB
>
> Regards
>
> Minh
>
>
> *Sent:* Thursday, January 4, 2024 at 17:18
> *From:* "Matt Burgess" <[email protected]>
> *To:* [email protected]
> *Subject:* Re: Hardware requirement for NIFI instance
> If I remember correctly, the default Fetch Size for Postgresql is to
> get all the rows at once, which can certainly cause the problem.
> Perhaps try setting Fetch Size to something like 1000 or so and see if
> that alleviates the problem.
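>
> For reference, here is roughly what Fetch Size maps to at the plain JDBC
> level (a sketch only; the connection URL, credentials, and table name are
> placeholders). With the PostgreSQL driver, cursor-based fetching is only
> used when autocommit is off and the fetch size is non-zero; otherwise the
> whole result set is materialized in memory.
>
>     import java.sql.Connection;
>     import java.sql.DriverManager;
>     import java.sql.ResultSet;
>     import java.sql.Statement;
>
>     public class FetchSizeSketch {
>         public static void main(String[] args) throws Exception {
>             try (Connection conn = DriverManager.getConnection(
>                     "jdbc:postgresql://localhost:5432/mydb", "user", "secret")) {
>                 // What ExecuteSQL's "Set Auto Commit = false" corresponds to; the
>                 // PostgreSQL driver only streams with a cursor when autocommit is off.
>                 conn.setAutoCommit(false);
>                 try (Statement stmt = conn.createStatement()) {
>                     stmt.setFetchSize(1000); // fetch 1000 rows per round trip instead of all rows
>                     try (ResultSet rs = stmt.executeQuery("SELECT * FROM my_table")) {
>                         while (rs.next()) {
>                             // process one row at a time; memory use stays bounded
>                         }
>                     }
>                 }
>                 conn.commit();
>             }
>         }
>     }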
>
> Regards,
> Matt
>
> On Thu, Jan 4, 2024 at 8:48 AM Etienne Jouvin <[email protected]>
> wrote:
> >
> > Hello.
> >
> > I also think the problem is more about the processor, I guess ExecuteSQL.
> >
> > You should play with the batch configuration and the commit flag so that
> > intermediate FlowFiles are committed.
> >
> > The out-of-memory exception makes me believe the full table is being
> > retrieved at once, and if the table is huge the FlowFile content becomes
> > very large.
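> >
> > For example, the relevant ExecuteSQL properties could look something like
> > this (the values are only illustrative and should be tuned to the data):
> >
> >     Fetch Size             = 1000
> >     Max Rows Per Flow File = 100000
> >     Output Batch Size      = 10
> >     Set Auto Commit        = false
> >
> > With Max Rows Per Flow File and Output Batch Size set, the result set is
> > split into several smaller FlowFiles that are committed as they are
> > produced, instead of a single huge FlowFile at the end.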
> >
> >
> >
> >
> > On Thu, Jan 4, 2024 at 14:37, Pierre Villard
> > <[email protected]> wrote:
> >>
> >> It should be memory efficient so I think this is likely a configuration
> >> aspect of your processor. Can you share the configuration for all
> >> properties?
> >> As a side note: if NiFi ran out of memory, you'd always want to restart
> >> it because you are never sure what's the state of the JVM after an OOME.
> >>
> >> On Thu, Jan 4, 2024 at 17:26, <[email protected]> wrote:
> >>>
> >>>
> >>> Hello all,
> >>>
> >>> Who could help me determine the CPU/memory needed for a NiFi instance
> >>> to fetch data from a PostgreSQL database hosted in Google Cloud?
> >>>
> >>> We got this error:
> >>> ==> Error: executesql.error.message
> >>> Ran out of memory retrieving query results.
> >>>
> >>> The processor ExecuteSQL has this config: Set Auto Commit ==> false
> >>> Driver JAR used: postgresql-42.7.1.jar
> >>> Java version: jdk-11.0.19
> >>>
> >>> Table information:
> >>> Number of rows: 14958836
> >>> Number of fields: 20
> >>>
> >>> Linux Rocky8
> >>>
> >>> Architecture: x86_64
> >>> CPU op-mode(s): 32-bit, 64-bit
> >>> Byte Order: Little Endian
> >>> CPU(s): 2
> >>> On-line CPU(s) list: 0,1
> >>> Thread(s) per core: 2
> >>> Core(s) per socket: 1
> >>> Socket(s): 1
> >>> NUMA node(s): 1
> >>> Vendor ID: GenuineIntel
> >>> BIOS Vendor ID: Google
> >>> CPU family: 6
> >>> Model: 85
> >>> Model name: Intel(R) Xeon(R) CPU @ 2.80GHz
> >>> Stepping: 7
> >>> CPU MHz: 2800.286
> >>> BogoMIPS: 5600.57
> >>> Hypervisor vendor: KVM
> >>> Virtualization type: full
> >>> L1d cache: 32K
> >>> L1i cache: 32K
> >>> L2 cache: 1024K
> >>> L3 cache: 33792K
> >>> NUMA node0 CPU(s): 0,1
> >>>
> >>> Memory : 8GB
> >>>
> >>> Thanks for your help.
> >>>
> >>> Minh
>
>
>
