I think the hard part here is taking a "raw" file like PDF bytes and
creating a record in a certain format. For now I think ScriptedReader
is your best bet: you can read the entire input stream into a byte
array, then return a Record that contains a "bytes" field holding
that data. You can create the Record's schema in the script itself.
You may not need to merge at all if your Fetch Size is set appropriately. For
your case I don't recommend setting Max Rows Per Flow File, because you
still have to wait for the entire result set to be processed before the
FlowFile(s) get sent downstream. Also, if you set Output Batch Size
you can't use MergeContent's Defragment strategy afterwards, because the
fragment.count attribute is not written when batches are sent out early.
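For illustration, the "single FlowFile, no merge" configuration would look
something like this (the Fetch Size value is just a starting point to tune,
not a recommendation):

    ExecuteSQL:
      Fetch Size: 10000           <- rows per round trip from the JDBC driver
      Max Rows Per Flow File: 0   <- 0 = the whole result set in one FlowFile
      Output Batch Size: 0        <- 0 = hold output until the result set is done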
You can merge multiple Avro FlowFiles with MergeRecord, using an Avro Reader
and an Avro Writer.
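For example (a sketch; the thresholds are assumptions you'd tune for your
tables):

    MergeRecord:
      Record Reader: AvroReader
      Record Writer: AvroRecordSetWriter
      Merge Strategy: Bin-Packing Algorithm
      Minimum Number of Records: 10000
      Max Bin Age: 5 min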
And the important thing for us is that there is only one Avro file per table.
So is it possible to merge Avro files into one Avro file?
Regards
Sent: Thursday, January 4, 2024 at 7:01 PM
From: e-soci...@gmx.fr
To: users@nifi.apache.org
Cc: users@nifi.apache.org
Subject: Re: Hardware requirement for NIFI instance
Hello all,
Thanks a lot for the replies.
More details:
All the properties of ExecuteSQL are left at their defaults, except "Set Auto Commit: false".
The SQL command could not be simpler: "select * from ${db.table.fullname}".
The NiFi versions are 1.16.3 and 1.23.2.
If I remember correctly, the default Fetch Size for PostgreSQL fetches
all the rows at once, which can certainly cause this problem.
Perhaps try setting Fetch Size to something like 1000 or so and see if
that alleviates the problem.
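As a plain-JDBC sketch of what that setting does under the hood (connection
details are placeholders; note the PostgreSQL driver only honors the fetch
size when auto-commit is off, which is why "Set Auto Commit: false" matters
here):

    import java.sql.DriverManager

    def conn = DriverManager.getConnection('jdbc:postgresql://host:5432/db', 'user', 'pass')
    conn.autoCommit = false   // PG driver streams via a cursor only with auto-commit off
    def stmt = conn.createStatement()
    stmt.fetchSize = 1000     // pull 1000 rows per round trip instead of the whole result set
    def rs = stmt.executeQuery('select * from my_table')
    while (rs.next()) {
        // process one row at a time; memory stays bounded
    }
    rs.close(); stmt.close(); conn.close()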
Regards,
Matt
Minh,
As you are pulling data from the database, and you have no control over how many
rows there might be in the result set, I would look at paging the data out so
that you can control the maximum size of the result set held in memory.
Looking at the ExecuteSQL processor, I would say "Fetch Size" and
"Max Rows Per Flow File" are the properties to start with.
We've had this occur when executing complex queries and/or queries on large
tables in ExecuteSQL.
We typically try out some values of Max Rows Per Flow File and Fetch Size (both
set to the same value) in the range of 1000, 10k, 50k, 100k until it works without
memory issues. Changing the Output Batch Size can also help, since FlowFiles
then get sent downstream before the whole result set has been processed.
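As a sketch of that tuning approach (10k is just one of the candidate values
above; remember the earlier caveat that setting Output Batch Size prevents
defragment-style merging downstream):

    ExecuteSQL:
      Fetch Size: 10000
      Max Rows Per Flow File: 10000
      Output Batch Size: 1    <- commit and ship each FlowFile as soon as it is full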
Hello.
I also think the problem is with the processor, I guess ExecuteSQL.
You should play with the batch configuration and the commit flag so that
intermediate FlowFiles get committed.
The out-of-memory exception makes me believe the full table is retrieved,
and if it is huge the FlowFile content is very large.
ExecuteSQL should be memory efficient, so I think this is likely a
configuration aspect of your processor. Can you share the configuration
for all properties?
As a side note: if NiFi ran out of memory, you'd always want to restart it,
because you are never sure what state the JVM is in after an OOME.
Hello all,
Who could help me determine the CPU/memory needed for a NiFi instance to fetch data from PostgreSQL hosted in Google Cloud?
We got this error:
==> Error: executesql.error.message
Ran out of memory retrieving query results.
The processor ExecuteSQL has this config: "Set Auto Commit: false" (all other properties at their defaults).
Hi Roman,
Embedded ZooKeeper is really just a convenience for easily setting up a
cluster for dev/testing purposes. For production you'd definitely want an
external ZooKeeper, or use the ZooKeeper-less option if running NiFi on
Kubernetes. IMO embedded ZK should never be used in production.
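As a rough sketch of the external-ZooKeeper setup (hostnames are
placeholders):

    # nifi.properties on each of the 3 nodes
    nifi.cluster.is.node=true
    nifi.state.management.embedded.zookeeper.start=false
    nifi.zookeeper.connect.string=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181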
Hello,
I am looking for a recommendation for a NiFi cluster configuration (3 nodes).
I am wondering about the ZooKeeper configuration and whether we need an
external ZooKeeper for the production cluster. I don't see any official
information on this.