Re: Pattern advice - files on disk into a record field

2024-01-04 Thread Matt Burgess
I think the hard part here is taking a "raw" file like PDF bytes and creating a record in a certain format. For now I think ScriptedReader is your best bet, you can read the entire input stream in as a byte array then return a Record that contains a "bytes" field containing that data. You can creat
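As a rough illustration of the idea in this message (read the whole input stream into a byte array, then return a record whose "bytes" field holds that data), here is a minimal Python sketch. It is not the ScriptedReader API: the function name `read_as_bytes_record` and the use of a plain dict as the record are assumptions made for illustration only.

```python
import io

def read_as_bytes_record(stream, field_name="bytes"):
    """Read the entire stream and wrap it in a one-field record (hypothetical helper)."""
    data = stream.read()          # whole content is held in memory as one byte array
    return {field_name: data}     # a plain dict stands in for a NiFi Record here

# Example input: an in-memory stream standing in for a PDF file on disk.
record = read_as_bytes_record(io.BytesIO(b"%PDF-1.7 example content"))
print(len(record["bytes"]))
```

Because the whole file is read at once, this approach only suits files that fit comfortably in memory.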

Re: Hardware requirement for NIFI instance

2024-01-04 Thread Matt Burgess
You may not need to merge if your Fetch Size is set appropriately. For your case I don't recommend setting Max Rows Per Flow File because you still have to wait for all the results to be processed before the FlowFile(s) get sent "downstream". Also if you set Output Batch Size you can't use Merge do

Re: Hardware requirement for NIFI instance

2024-01-04 Thread Pierre Villard
You can merge multiple Avro flow files with MergeRecord, using an Avro Reader and an Avro Writer. On Thu, Jan 4, 2024 at 22:05, wrote: > And the important thing for us is to have only one Avro file per table. > > So is it possible to merge Avro files into one Avro file? > > Regards > > > *Sent:* Thu

Re: Hardware requirement for NIFI instance

2024-01-04 Thread e-sociaux
And the important thing for us is to have only one Avro file per table. So is it possible to merge Avro files into one Avro file? Regards Sent: Thursday, January 4, 2024 at 19:01 From: e-soci...@gmx.fr To: users@nifi.apache.org Cc: users@nifi.apache.org Subject: Re: Hardware requirement for NIFI i

Re: Hardware requirement for NIFI instance

2024-01-04 Thread e-sociaux
Hello all, Thanks a lot for the replies. Here are more details. All the properties for ExecuteSQL are left at their defaults, except "Set Auto Commit: false". The SQL command could not be simpler: "select * from ${db.table.fullname}". The NiFi versions are 1.16.3 and 1.23.2

Re: Hardware requirement for NIFI instance

2024-01-04 Thread Matt Burgess
If I remember correctly, the default Fetch Size for Postgresql is to get all the rows at once, which can certainly cause the problem. Perhaps try setting Fetch Size to something like 1000 or so and see if that alleviates the problem. Regards, Matt On Thu, Jan 4, 2024 at 8:48 AM Etienne Jouvin wr
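To show why a bounded fetch size helps, here is a minimal Python sketch of the general technique: pull rows in fixed-size batches instead of materializing the full result set. It uses an in-memory SQLite database from the standard library purely as a stand-in; it is not NiFi's ExecuteSQL implementation and not the PostgreSQL driver. The table name `demo` and the helper `fetch_in_batches` are assumptions made for illustration.

```python
import sqlite3

def fetch_in_batches(conn, query, fetch_size=1000):
    """Yield lists of rows, holding at most fetch_size rows in memory at a time."""
    cur = conn.cursor()
    cur.execute(query)
    while True:
        rows = cur.fetchmany(fetch_size)   # ask for the next batch only
        if not rows:
            break
        yield rows

# Hypothetical demo table with 2500 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER)")
conn.executemany("INSERT INTO demo VALUES (?)", [(i,) for i in range(2500)])

batch_sizes = [len(b) for b in fetch_in_batches(conn, "SELECT id FROM demo", 1000)]
print(batch_sizes)  # [1000, 1000, 500]
```

With a fetch size of 1000, the consumer never holds more than 1000 rows at once, which is the effect the suggested Fetch Size setting aims for.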

RE: Hardware requirement for NIFI instance

2024-01-04 Thread stephen.hindmarch.bt.com via users
Minh, As you are pulling data from the database, and you have no control of how many rows there might be in your result set, I would look at paging the data out so that you can control the maximum size of the result set held in memory. Looking at the ExecuteSQL processor I would say that "Fetch

RE: Hardware requirement for NIFI instance

2024-01-04 Thread Isha Lamboo
We've had this occur when executing complex queries and/or queries on large tables in ExecuteSQL. We typically try out some values of Max Rows Per Flow File and Fetch Size (both to the same value) in the range of 1000, 10k, 50k, 100k to make it work without memory issues. Changing the Output Ba

Re: Hardware requirement for NIFI instance

2024-01-04 Thread Etienne Jouvin
Hello. I also think the problem lies more with the processor, which I guess is ExecuteSQL. You should experiment with the batch configuration and the commit flag so that intermediate FlowFiles are committed. The out-of-memory exception makes me believe the full table is retrieved, and if it is huge the FlowFile content is very large.

Re: Hardware requirement for NIFI instance

2024-01-04 Thread Pierre Villard
It should be memory efficient so I think this is likely a configuration aspect of your processor. Can you share the configuration for all properties? As a side note: if NiFi ran out of memory, you'd always want to restart it because you are never sure what's the state of the JVM after an OOME. Le

Hardware requirement for NIFI instance

2024-01-04 Thread e-sociaux
Hello all, Who could help me determine the CPU/memory needed for a NiFi instance to fetch data from PostgreSQL hosted in Google? We got this error: ==> Error : executesql.error.message Ran out of memory retrieving query results. The processor ExecuteSQL has this config: Set Au

Re: Embedded zookeeper or External zookeeper for Nifi Cluster.

2024-01-04 Thread Pierre Villard
Hi Roman, Embedded Zookeeper is really just for convenience for easily setting up a cluster for dev/testing purposes. For production, you'd definitely want to have an external Zookeeper, or use the ZK-less option if running NiFi on Kubernetes. IMO embedded ZK should never be used for production.

Embedded zookeeper or External zookeeper for Nifi Cluster.

2024-01-04 Thread Roman Wesołowski
Hello, I am looking for a recommendation for a NiFi cluster configuration (3 nodes). I am wondering about the ZooKeeper configuration and whether we need to have an external ZooKeeper for the production cluster. I don't see any official information t