Jens,
Try ForkRecord [1] with "Mode" set to "Extract" and "Include Parent
Fields" set to "true". I think that does what you're looking to do.
Regards,
Matt
[1]
Nathan,
If you have multiple JSON messages in one flow file, is it in one large
array, or a top-level JSON object with an array inside? Also are you trying
to transform each message or the whole thing (i.e. do you need to know
about more than one message at a time)? If you have a top-level array
Yes, that's a pretty common operation amongst NiFi developers. In
conf/bootstrap.conf there's a section called Enable Remote Debugging
and a commented-out line something like:
java.arg.debug=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005
You can remove the comment from that
Although this is an "unnatural" use of Groovy (and a conversation much
better suited for the dev list :), it is possible to get at a map of
defined variables (key and value). This counts on particular
implementations of the API and that there is no SecurityManager
installed in the JVM so Groovy
This is probably better suited for the dev list (not sure if you're
subscribed but please do, BCC'ing users and moving to dev), but the
implementations (components and their NARs) are not designed to be
subclassed for custom extensions outside the codebase, can you
describe your use case (and
Harsha,
There are two NARs associated with Hive components,
nifi-hive-services-api-nar which has the Hive1_1ConnectionPool service
(actually an interface, but that's under the hood), and the
nifi-hive1_1-nar which has the processors that declare themselves as users
of that interface (and the
> *From:* Matt Burgess
> *Sent:* Tuesday, July 7, 2020 7:05 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: Hive_1_1 Processors and C
Luca,
I'm guessing the issue is the same as the one in [1] but it just wasn't
fixed for FetchFTP. Please feel free to write an improvement Jira [2] to
add this to FetchFTP as well.
Regards,
Matt
[1] https://issues.apache.org/jira/browse/NIFI-4137
[2] https://issues.apache.org/jira/browse/NIFI
Mike,
I think you can use LookupRecord with a RestLookupService to do this.
If it's missing features or it otherwise doesn't work for your use
case, please let us know and/or write up whatever Jiras you feel are
appropriate.
Regards,
Matt
On Mon, Jun 29, 2020 at 4:56 PM Mike Thomsen wrote:
Although the error attribute can help as a workaround, counting on a
text value is probably not the best option (although it's pretty much
all we have for now). I wrote up NIFI-7524 [1] to add a "retry"
relationship to ExecuteSQL like we have for PutSQL and
PutDatabaseRecord. It would route things
Asmath,
InvokeHttp routes the original flowfile to a number of different
relationships based on things like the status code. For example if
you're looking for a 2xx code but want to retry on that for some
reason, you'd use the "Original" relationship. If you want a retryable
code (5xx) you can
Satish,
Can you provide some sample data that causes this issue?
Thanks,
Matt
On Wed, Dec 2, 2020 at 5:18 AM naga satish wrote:
>
> Hi all, In record readers(CSVreader) when schema strategy is set to
> InferSchema, sometimes it keeps on giving error. the error states that index
> of
Geoffrey,
Where are the two flowfiles coming from? This use case is often
handled in NiFi using LookupRecord with one of the LookupService
implementations (REST, RDBMS, CSV, etc.). We don't currently have a
mechanism (besides scripting) to do enrichment/lookups from flowfiles.
For your script,
David,
The documentation for the metrics is in the "help" section of the datapoint
definition; if you hit the REST endpoint you can see the descriptions. They
are also listed in code [1].
Regards,
Matt
[1]
Asmath,
GetFile doesn't take an input connection, but if the attribute is
going to contain a file to ingest, you can use FetchFile instead. To
get an attribute from a database, take a look at LookupAttribute with
a SimpleDatabaseLookupService. Depending on the query you were going
to execute, you
Eric,
I don't believe it's possible in NiFi per se, because you'd have to
set it via a property, and properties have unique and static names so
EL is not evaluated on them. However you can use Groovy with
ExecuteScript to do this, check out [1] under the recipe "Add an
attribute to a flow file".
Vibath,
What is the "Fetch Size" property set to? It looks like PostgreSQL
will load all results if Fetch Size is set to zero [1]. Try setting it
to 1 or something like that, whatever doesn't use too much memory
but doesn't slow down the performance too much.
Regards,
Matt
[1]
Dirk,
We could look at adding a FileWatcher or something to
InvokeScriptedProcessor, but I doubt we'd want to allow re-evaluating
the script on the fly, maybe we would just set a flag indicating a
change was detected and the next time the processor is started or the
script would be evaluated,
John,
It should be generating multiple queries with OFFSET, I tried to
reproduce in a unit test (using Derby not MySQL) and everything looked
fine. I ran it once with 3 rows and a partition size of 2 and got the
expected 2 flowfiles (one with 2 rows and one with 1). Then I added 6
rows and ran
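For illustration, with a hypothetical table and Partition Size of 2 over 3 matching rows, the generated paginated statements would look something like this (table and column names are made up; Derby-style OFFSET/FETCH syntax shown, whereas MySQL would use LIMIT/OFFSET):

```sql
-- first partition: rows 1-2
SELECT * FROM mytable ORDER BY id OFFSET 0 ROWS FETCH NEXT 2 ROWS ONLY;
-- second partition: the remaining row
SELECT * FROM mytable ORDER BY id OFFSET 2 ROWS FETCH NEXT 2 ROWS ONLY;
```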
Asmath,
Upsert in SQL Server (without NiFi) can be difficult, and even
error-prone if concurrency is needed [1]. I suspect that's why it
hasn't been attempted in PutDatabaseRecord (well, via the MSSQL
adapter(s)) as of yet. I haven't tried it without creating a procedure
so I'm not sure if the
Jens,
What is the Schema Access Strategy set to in your CSVReader? If "Infer
Schema" or "Use String Fields From Header", the setting of "Treat
First Line As Header" should be ignored as those two options require a
header be present anyway. If you know the schema ahead of time you
could set it in
There's an example template on the Example Dataflow Templates page [1]
called Retry_Count_Loop.xml [2], not sure what components it uses
though.
Regards,
Matt
[1] https://cwiki.apache.org/confluence/display/NIFI/Example+Dataflow+Templates
[2]
Geoffrey,
In general you won't need to create your own DataType objects, instead
you can use the RecordFieldType methods such as
RecordFieldType.ARRAY.getArrayDataType(DataType elementType, boolean
elementsNullable).getDataType(). So for an array of ints:
myRecordFields.add(new
Khaja,
There are two options in NiFi for incremental database fetch:
QueryDatabaseTable and GenerateTableFetch. The former is more often
used on a standalone NiFi cluster for single tables (as it does not
accept an incoming connection). It generates the SQL needed to do
incremental fetching, then
Geoffrey,
There are two main types of LookupService implementations used by
processors like LookupAttribute and LookupRecord, namely
LookupService&lt;String&gt; and LookupService&lt;Record&gt;. The former does a
single lookup and uses the single returned key. LookupRecord is most
often used with LookupService&lt;Record&gt;
Geoffrey,
There's a really good blog by the man himself [1] :) I highly recommend the
official blog in general, lots of great posts and many are record-oriented
[2]
Regards,
Matt
[1] https://blogs.apache.org/nifi/entry/record-oriented-data-with-nifi
[2] https://blogs.apache.org/nifi/
On Wed,
Carlos,
From the DBCP doc:
If maxIdle is set too low on heavily loaded systems it is possible you
will see connections being closed and almost immediately new
connections being opened. This is a result of the active threads
momentarily closing connections faster than they are opening them,
You might be running into NIFI-7379 [1] where the different Prometheus
components are writing to the same registries. If you upgrade to a
later version of NiFi you should see the correct data.
Regards,
Matt
[1] https://issues.apache.org/jira/browse/NIFI-7379
On Mon, Aug 9, 2021 at 11:35 AM
Tom,
Which implementation of the Provenance Repository are you using? If
not the VolatileProvenanceRepository, can you try that as a
workaround? Also are you using the 1.14.0 version of the C2 server?
Regards,
Matt
On Tue, Sep 21, 2021 at 3:45 PM Tomislav Novosel
wrote:
>
> Hi to all,
The approach in #1 is already present in a few Put processors like
PutHive3QL, the property is named "Rollback on Failure" and takes a
boolean value. The docs explain that if set to false, the flowfile is
routed to failure, and if true will throw an exception and rollback
the session. Check
Jim,
You can apply Chris's solution to ExecuteScript itself, as you can add
dynamic properties there and they will evaluate the EL for you. If you
set a dynamic property "myNode" to "${ip()}", you should be able to
use myNode.evaluateAttributeExpressions().getValue().
Regards,
Matt
On Mon, Jul
Scott et al,
There are a number of options for monitoring flows, including
backpressure and even backpressure prediction:
1) The REST API for metrics. As you point out, it's subject to the
same authz/authn as any other NiFi operation and doesn't sound like it
will work out for you.
2) The
Jim,
Are you doing the whole series of directories in one call to
onTrigger? If so you could keep getting the current time and if you
haven't switched directories then you could reissue the bulletin if
the elapsed time > 5 mins, then reset the variable to determine the
next elapsed time.
ReportingTask with
>> > "SELECT * FROM CONNECTION_STATUS WHERE isBackPressureEnabled = true" and
>> > the new LoggingRecordSink as you suggested. Everything is working
>> > flawlessly now. Thank you again!
>> >
>> > Scott
Sven,
This is a recently discovered bug, I am still working on
characterizing the issue before writing a Jira to describe it.
NIFI-9169 [1] has the same cause but is a slightly different issue. So
far the issue seems to be with using update key(s) with "Quote
Identifiers" set to true. Setting it
Jeremy,
I can't reproduce this on the latest main branch (closest to 1.15.3).
What's weird about that error message is that it says 'mongo-uri' is a
sensitive property, but it is not. I set up a Parameter Context (PC)
with a non-sensitive parameter "muri", set the PC on the root Process
Group
When you said "fetchSize set low", I assume you mean non-zero, a zero
will fetch all the rows at once. How did you paginate your query with
ExecuteSQLRecord? I was going to suggest GenerateTableFetch in front
to paginate the queries for you, but it definitely seems like we
should be able to do or
Can you share your conf/state-management.xml contents?
On Mon, Aug 28, 2023 at 8:33 AM Williams, Van
wrote:
> There are files that are appearing in the /tmp folder on some of our NiFi
> Linux hosts. The files all begin with 'file', and they somewhat quickly
> fill up that folder (we have an
Richard,
I'll look into the logging stuff, definitely strange behavior. Are you
using ExecuteScript or InvokeScriptedProcessor? I noticed an
intermittent failure on InvokeScriptedProcessor in 1.20 [1] but maybe
it's present in more of the scripting stuff as they use the same
classes to handle the
If you have large FlowFiles and are trying to sample records from
each, you can use SampleRecord. It has Interval Sampling,
Probabilistic Sampling, and Reservoir Sampling strategies, and I have
a PR [1] up to add Range Sampling [2].
Regards,
Matt
[1] https://github.com/apache/nifi/pull/5878
[2]
It didn't display in NiFi until NIFI-10027 [1], which has recently
been merged. It will be in the upcoming 1.17.0 release (or perhaps a
1.16.3 if the current RC is not released).
Regards,
Matt
[1] https://issues.apache.org/jira/browse/NIFI-10027
On Mon, May 23, 2022 at 11:46 AM Ryan Hendrickson
Mike,
I recommend QueryDatabaseTableRecord with judicious choices for the
following properties:
Fetch Size: This should be tuned to return as many rows as possible
without causing network issues such as timeouts. It can be set to the
same value as Max Rows Per Flow File, ensuring one fetch per
MiNiFi is actually alive and well, we just moved it into the NiFi codebase.
We’re actively developing a Command-and-Control (C2) capability to remotely
update the flow on the agent for example.
You can configure MiNiFi agents for SSL over Site-to-Site in order to talk to
secure NiFi instances.
Benji,
I've built a custom framework NAR [1] that has additional logging to
identify which components, process groups, etc. are causing the issue.
If you'd like, please feel free to save off your existing framework
NAR and replace it with this, then share the relevant section of the
log (matching
Sergio,
Your email says the flowfiles each contain a record to insert, but
PutSQL takes a full SQL statement such as INSERT INTO tableName VALUES
('hello', 'world', 1). If you have a record of data rather than a SQL
statement, you can use PutDatabaseRecord for that instead. If you do
have SQL
Geoffrey,
The biggest problem with JSON columns across the board is that the
JDBC and java.sql.Types specs don't handle them natively, and NiFi
records don't recognize JSON as a particular type, we are only
interested in the overall datatype such as String since NiFi records
can be in any
Thanks Vijay! I agree those processors should do the trick but there
were things in the transformation between input and desired output
that I wasn't sure of their origin. If you are setting constants you
can use either a Shift or Default spec, if you are moving fields
around you can use a Shift
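As a hypothetical illustration (field names invented), a chained Jolt spec that sets a constant with a Default operation and renames a field with a Shift operation might look like:

```json
[
  {
    "operation": "default",
    "spec": {
      "status": "NEW"
    }
  },
  {
    "operation": "shift",
    "spec": {
      "oldName": "newName",
      "*": "&"
    }
  }
]
```

Here `"*": "&"` passes all other fields through under their original names.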
Michał,
There are some options in NiFi that should work for you, we have
LookupRecord [1] which does simple enrichment based on a common field
(such as id) that acts like a SQL left join, there are a series of
articles [2] to explain in more detail. For more complex use cases
there are
Michael,
For the authorization use case, I recommend against using the
reporting task and instead using the built in endpoint for metrics
(see https://issues.apache.org/jira/browse/NIFI-7273 for more
details). The NiFi REST API (to include that endpoint) is subject to
the authentication and
Did you upgrade the version of Java when you upgraded NiFi? Later versions of
Java don’t include the Nashorn (ECMAScript) library, but I believe we added it
explicitly, perhaps for the 1.20 release (not at computer right now)
Sent from my iPhone
> On Jan 16, 2023, at 6:28 PM, Vijay Chhipa
Stored procedures that take no output parameters and return ResultSets should
work fine with ExecuteSQL, but for DBs that allow OUT and INOUT parameters,
those won’t make it into the outgoing FlowFile (in either content or
attributes).
Regards,
Matt
> On Feb 27, 2023, at 4:19 PM, Dmitry
utSQL
>
> CALL MYPROCEDURE.PROC1('N', ?,?,?,?)
>
> and I need to supply sql arg attributes... like...
>
> sql.args.1.type = 1
> sql.args.1.value = not sure what to put here
> sql.args.2.type = 4
> sql.args.2.value = not sure what to put here
> etc...
>
> Am I on t
I'm resurrecting the PR to add a property to specify a file location
for the Jolt spec. In the meantime you could maintain the spec as the
value of a context variable and point to it in the Jolt Spec property.
Then you can share amongst the nodes and maintain the spec in one
place.
Regards,
Matt
support as well, feel free to file a Jira
for that improvement if it will help.
- Matt
On Tue, Apr 18, 2023 at 6:16 PM Matt Burgess wrote:
Jim,
QueryRecord uses Apache Calcite under the hood and is thus at the
mercy of the SQL standard (and any additional rules/dialect from
Apache Calcite) so in general you can't select "all except X" or "all
except change X to Y". Does it need to be SQL executed against the
individual fields? If
ach in Groovy to grab line N and avoid loading the entire CSV file into string variable
text?
On Thu, Feb 9, 2023 at 7:18 PM Matt Burgess <mattyb...@gmail.com> wrote:
I’m AFK ATM but Range Sampling was added into the SampleRecord processor
(https://issues.apache.org/jira/browse/NIFI-9814), the Jira doesn’t say which
version it went into but it is definitely in 1.19.1+. If that’s available to
you then you can just specify “2” as the range and it will only
a transaction? Is it possible with
> PutDatabaseRecord?
>
> On Fri, Feb 10, 2023 at 14:29, Matt Burgess
> wrote:
I agree with Chris here about using PutDatabaseRecord instead of the
Split and PutSQL. PutDatabaseRecord will process all records in a
FlowFile as a transaction, so in PutDatabaseRecord you can set an
AvroReader (to read the records coming out of ExecuteSQL) and the
statement type (such as INSERT)
Jim,
I tried to use Jolt for this but I found in the doc that if you try to
set an empty array or map to null or the empty string it will retain
the empty array or map (no idea why). Since you know the name of the
fields (and I assume want to keep the schema intact) you can use
Kyrindor,
Can you provide an example of the kind of mapping you want to do? It
sounds like UpdateRecord [1] should work to change input fields to
output fields. For joining we offer "enrichment" and "lookup"
components that can add fields based on some other field value(s).
Regards,
Matt
[1]
destination
>
> On Wed, Jun 21, 2023 at 11:36 AM Matt Burgess wrote:
Jim,
You can find out in Github [1] or from your installation you can do
(substituting your NiFi version in the NAR name):
jar -tvf lib/nifi-scripting-nar-1.16.3.nar | grep groovy
Regards,
Matt
[1]
Etienne,
What instance name / id are you referring to?
On Tue, Feb 13, 2024 at 8:43 AM Etienne Jouvin
wrote:
> Hello all.
>
> Just simple question, is there a way to access, from controller service,
> to the instance name / id event in not cluster implementation?
>
> Regards
>
> Etienne Jouvin
It looks like we need to call release() from some place(s) where we
don't currently. HBase had the same issues [1]. We use this in the
StandardMapCacheServer and ListenBeats, are you using either, neither,
or both?
Regards,
Matt
[1] https://github.com/netty/netty/issues/12549
On Tue, Dec 26,
Jim,
When you say you want to "avoid having to output them to a temp
directory", does that include the content repo? If not you can use
UnpackContent with a Packaging Type of zip. I tried on both JARs and
NARs and it works.
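This works because a NAR (like a JAR) is just a zip archive, so any zip tooling can read it. A minimal sketch with a dummy archive (file names are hypothetical):

```shell
# Build a dummy "NAR" (really just a zip) and list its entries,
# demonstrating that standard zip tooling treats it like any other archive
echo "demo entry" > entry.txt
python3 -m zipfile -c demo.nar entry.txt
python3 -m zipfile -l demo.nar
```

The same listing would work against a real NAR from NiFi's lib/ directory.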
Regards,
Matt
On Sun, Dec 31, 2023 at 12:37 PM James McMahon wrote:
>
If I remember correctly, the default Fetch Size for Postgresql is to
get all the rows at once, which can certainly cause the problem.
Perhaps try setting Fetch Size to something like 1000 or so and see if
that alleviates the problem.
Regards,
Matt
On Thu, Jan 4, 2024 at 8:48 AM Etienne Jouvin
il about the output flowfile :
>>
>> executesql.query.duration = 245118
>> executesql.query.executiontime = 64122
>> executesql.query.fetchtime = 180996
>> executesql.resultset.index = 0
>> executesql.row.count = 14961077
I think the hard part here is taking a "raw" file like PDF bytes and
creating a record in a certain format. For now I think ScriptedReader
is your best bet, you can read the entire input stream in as a byte
array then return a Record that contains a "bytes" field containing
that data. You can
Indeed it looks like someone else has run into this:
https://stackoverflow.com/questions/77615582/apache-nifi-2-x-org-eclipse-jetty-http-badmessageexception-400-invalid-sni
On Wed, Dec 6, 2023 at 10:05 PM Adam Taft wrote:
> David,
>
> Any chance that the Jetty SNI related information could
Specifically set Fetch Size to something like 1000, by default setting
Fetch Size to zero will cause Postgres to fetch the entire ResultSet into
memory [1]. We should probably change that default to avoid problems like
this and with other drivers (for example, Oracle's default is 10 rows which
is
For completeness, this can also affect the QueryDatabaseTable processors
[1]. This will be fixed in the next release(s).
Regards,
Matt
[1] https://issues.apache.org/jira/browse/NIFI-1931
On Tue, Mar 19, 2024 at 10:56 AM wrote:
> Hello Joe,
>
> Thanks for the response.
> But I found the