You can merge multiple Avro FlowFiles into a single one with MergeRecord, using an Avro Reader and an Avro Writer.
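A minimal sketch of the MergeRecord side (the property names are the stock ones in recent NiFi 1.x releases; the numeric thresholds are example values you would tune so that all FlowFiles for one table land in the same bin):

    MergeRecord
        Record Reader             : AvroReader
        Record Writer             : AvroRecordSetWriter
        Merge Strategy            : Bin-Packing Algorithm
        Minimum Number of Records : 1000000    (example value)
        Maximum Number of Records : 20000000   (example value)
        Maximum Bin Size          : 2 GB       (example value)
        Max Bin Age               : 5 min      (example value)

With settings like these the incoming Avro FlowFiles are combined into a single Avro FlowFile per bin; if several tables flow through the same connection, a Correlation Attribute Name (e.g. the table name attribute) keeps each table in its own bin.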
On Thu, Jan 4, 2024 at 10:05 PM <[email protected]> wrote:

> And the important thing for us is to have only one Avro file per table.
>
> So is it possible to merge Avro files into one Avro file?
>
> Regards
>
>
> *Sent:* Thursday, January 4, 2024 at 19:01
> *From:* [email protected]
> *To:* [email protected]
> *Cc:* [email protected]
> *Subject:* Re: Hardware requirement for NIFI instance
>
> Hello all,
>
> Thanks a lot for the reply.
>
> Here are some more details.
>
> All the properties for ExecuteSQL are set to their defaults, except "Set Auto
> Commit: false".
>
> The SQL command could not be simpler than "*select * from
> ${db.table.fullname}*"
>
> The NiFi versions are 1.16.3 and 1.23.2
>
> I have also tested the same SQL command on another NiFi instance (8 cores /
> 16 GB RAM) and it works.
> The result is an Avro file of 1.6 GB.
>
> Details about the output FlowFile:
>
> executesql.query.duration: 245118
> executesql.query.executiontime: 64122
> executesql.query.fetchtime: 180996
> executesql.resultset.index: 0
> executesql.row.count: 14961077
>
> File Size: 1.62 GB
>
> Regards
>
> Minh
>
>
> *Sent:* Thursday, January 4, 2024 at 17:18
> *From:* "Matt Burgess" <[email protected]>
> *To:* [email protected]
> *Subject:* Re: Hardware requirement for NIFI instance
> If I remember correctly, the default Fetch Size for PostgreSQL is to
> get all the rows at once, which can certainly cause the problem.
> Perhaps try setting Fetch Size to something like 1000 or so and see if
> that alleviates the problem.
>
> Regards,
> Matt
>
> On Thu, Jan 4, 2024 at 8:48 AM Etienne Jouvin <[email protected]>
> wrote:
> >
> > Hello.
> >
> > I also think the problem is more about the processor, I guess ExecuteSQL.
> >
> > You should play with the batch configuration and the commit flag to commit
> > intermediate FlowFiles.
> >
> > The out-of-memory exception makes me believe the full table is
> > retrieved, and if it is huge the FlowFile content is very large.
> >
> >
> > On Thu, Jan 4, 2024 at 2:37 PM Pierre Villard <[email protected]> wrote:
> >>
> >> It should be memory efficient, so I think this is likely a configuration
> >> aspect of your processor. Can you share the configuration for all
> >> properties?
> >> As a side note: if NiFi ran out of memory, you'd always want to restart
> >> it because you are never sure what the state of the JVM is after an OOME.
> >>
> >> On Thu, Jan 4, 2024 at 5:26 PM <[email protected]> wrote:
> >>>
> >>> Hello all,
> >>>
> >>> Who could help me determine the CPU/memory needed for a NiFi instance
> >>> to fetch data from PostgreSQL hosted in Google?
> >>>
> >>> We got this error:
> >>> ==> Error : executesql.error.message
> >>> Ran out of memory retrieving query results.
> >>>
> >>> The processor ExecuteSQL has this config: Set Auto Commit ==> false
> >>> Driver jar in use: postgresql-42.7.1.jar
> >>> Java version: jdk-11.0.19
> >>>
> >>> Table information:
> >>> rows number: 14958836
> >>> fields number: 20
> >>>
> >>> Linux Rocky8
> >>>
> >>> Architecture:        x86_64
> >>> CPU op-mode(s):      32-bit, 64-bit
> >>> Byte Order:          Little Endian
> >>> CPU(s):              2
> >>> On-line CPU(s) list: 0,1
> >>> Thread(s) per core:  2
> >>> Core(s) per socket:  1
> >>> Socket(s):           1
> >>> NUMA node(s):        1
> >>> Vendor ID:           GenuineIntel
> >>> BIOS Vendor ID:      Google
> >>> CPU family:          6
> >>> Model:               85
> >>> Model name:          Intel(R) Xeon(R) CPU @ 2.80GHz
> >>> Stepping:            7
> >>> CPU MHz:             2800.286
> >>> BogoMIPS:            5600.57
> >>> Hypervisor vendor:   KVM
> >>> Virtualization type: full
> >>> L1d cache:           32K
> >>> L1i cache:           32K
> >>> L2 cache:            1024K
> >>> L3 cache:            33792K
> >>> NUMA node0 CPU(s):   0,1
> >>>
> >>> Memory: 8GB
> >>>
> >>> Thanks for your help
> >>>
> >>> Minh
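Regarding the Fetch Size suggestion in the quoted thread above: ExecuteSQL exposes a Fetch Size property that is handed to the JDBC driver. A minimal stand-alone JDBC sketch of why this matters with the PostgreSQL driver (the connection URL, credentials and table name below are placeholders, not your actual setup): with auto-commit disabled and a fetch size set, rows are streamed through a cursor in batches instead of the whole result set being materialized in memory at once.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class FetchSizeDemo {
        public static void main(String[] args) throws Exception {
            // Placeholder URL/credentials; point this at your own PostgreSQL instance.
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/mydb", "user", "password")) {
                // The PostgreSQL driver only uses a cursor (and honors the fetch size)
                // when auto-commit is off; otherwise it buffers the entire result set.
                conn.setAutoCommit(false);
                try (Statement stmt = conn.createStatement()) {
                    stmt.setFetchSize(1000); // fetch 1000 rows per round trip
                    try (ResultSet rs = stmt.executeQuery("select * from my_table")) {
                        long rows = 0;
                        while (rs.next()) {
                            rows++; // process each row without holding the full table in memory
                        }
                        System.out.println("Rows read: " + rows);
                    }
                }
                conn.commit();
            }
        }
    }

The same principle applies inside ExecuteSQL: a non-zero Fetch Size together with "Set Auto Commit: false" lets the 14M-row table be read in batches rather than all at once.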
