Mike,
You can reference ${fileSize}. This is the size of a FlowFile in bytes.
Thanks
-Mark
On May 17, 2019, at 1:27 PM, Mike Thomsen
mailto:mikerthom...@gmail.com>> wrote:
Is there a way to get access to the flowfile size from EL? This said to use
Content-Length, but that didn't appear to
operty
descriptors that are scoped for attributes)
Andy LoPresto
alopre...@apache.org<mailto:alopre...@apache.org>
alopresto.apa...@gmail.com<mailto:alopresto.apa...@gmail.com>
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
On May 16, 2019, at 18:21, Mark Payne
mailto:ma
We could probably handle this without the user interface changing. The
processor could be smart enough to determine the proper ordering to evaluate
the properties based on whether or not it references another property. For
example if properties are:
first.name = John
last.name = Doe
greeting =
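A rough sketch of the ordering idea described above, outside of NiFi: treat each property as a node, add an edge for every `${...}` reference, and evaluate in topological order. The value given to `greeting` is hypothetical (the original message is cut off), the helper names are made up for illustration, and only plain `${name}` references are handled, not Expression Language functions.

```python
import re
from graphlib import TopologicalSorter  # Python 3.9+

# Matches simple ${property.name} references only (no EL functions).
REF = re.compile(r"\$\{([A-Za-z0-9_.]+)\}")

# 'greeting' value is a hypothetical example; the source text is truncated.
PROPS = {
    "greeting": "Hello ${first.name} ${last.name}",
    "first.name": "John",
    "last.name": "Doe",
}

def evaluation_order(props):
    """Return property names so that referenced properties come first."""
    graph = {
        name: {ref for ref in REF.findall(value) if ref in props}
        for name, value in props.items()
    }
    # static_order() raises graphlib.CycleError on circular references.
    return list(TopologicalSorter(graph).static_order())

def evaluate(props):
    """Resolve every property, substituting already-resolved references."""
    resolved = {}
    for name in evaluation_order(props):
        resolved[name] = REF.sub(
            lambda m: resolved.get(m.group(1), ""), props[name]
        )
    return resolved
```

With this approach the order in which the user enters the properties does not matter; a circular reference surfaces as an error rather than a silently wrong value.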
Wolfgang,
So the general pattern that you're running into here is that you want to ingest
data from Source 1
(a database), then filter/route the data based on a lookup in a second
system/dataset (the API).
So, for this I would recommend that you look at the LookupRecord processor.
So the flow,
John,
It sounds like you've got this working, which is great, so thanks for following
up and sharing
the transformation that you used. I'm curious, though, if you looked into using
PartitionRecord
instead of the Split -> JsonPath -> Merge path. PartitionRecord should make
that simpler and
will
William,
You should be able to perform this using the Wait/Notify pattern. But I think a
simpler (and likely better performing) alternative would be
to avoid splitting out into two directions at all and instead use a linear flow:
ListFile -> AttributesToJson -> ReplaceText -> InvokeHttp ->
Ryan,
In later versions of NiFi (1.8.0 I believe) the validation is done in the
background instead of in the web request,
so stopped processors won't slow down the UI. We also do allow the user to
select 1 or more components and
disable/enable the components in bulk. Older versions did not
RouteOnContent may be a good solution. ScanContent is probably a more efficient
recommendation, though.
ReplaceText would probably also work well, but you don't want to use Evaluation
Mode of Entire Text - you're buffering the content
of the entire FlowFile into memory and running a regex over
Thanks for reporting this, Viking! I have created a JIRA [1] to address this.
I expect to have a PR up for it shortly.
Thanks
-Mark
[1] https://issues.apache.org/jira/browse/NIFI-6110
On Mar 8, 2019, at 4:37 AM, Viking K
mailto:cyber_v...@hotmail.com>> wrote:
Hi,
I recently upgraded our
Chad,
This should not be a problem, given that all nodes have enough storage
available to handle the influx of data.
Thanks
-Mark
> On Mar 6, 2019, at 1:44 PM, Chad Woodhead wrote:
>
> Are there any negative effects of having filesystem mounts (dedicated mounts
> for each repo) used by the
On Feb 26, 2019, at 10:28 AM, Mark Payne
mailto:marka...@hotmail.com>> wrote:
Hi Shawn,
filename and uuid are attributes of the FlowFile. There's no fileSize attribute
(unless added explicitly by a processor). You can get the size of a FlowFile by
calling FlowFile.getSize()
Does that h
Hi Shawn,
filename and uuid are attributes of the FlowFile. There's no fileSize attribute
(unless added explicitly by a processor). You can get the size of a FlowFile by
calling FlowFile.getSize()
Does that help?
Thanks
-Mark
> On Feb 26, 2019, at 11:20 AM, Shawn Weeks wrote:
>
> Since
This is certainly a better route to go than my previous suggestion :) Have one
flow that grabs one of the datasets and stores it somewhere.
In a CSV or XML file, even. Then, have a second flow that pulls the other
dataset and uses LookupRecord to perform
the enrichment. The CSVLookupService and
Boris,
I would echo the cautions from Bryan & Joe. However, you could conceivably
achieve this by extracting out some id
into an attribute that would associate the two FlowFiles together (for example
'dataset.id'). Use MergeRecord or MergeContent
to merge the data together using that as a
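As an illustration only, the idea of associating FlowFiles through a shared attribute can be sketched in plain Python. FlowFiles are modelled here as dictionaries, and the function names are hypothetical; this is not how MergeRecord or MergeContent is implemented, just the grouping concept behind it.

```python
from collections import defaultdict

def bin_by_attribute(flowfiles, attribute="dataset.id"):
    """Group FlowFile-like dicts by the value of one attribute."""
    bins = defaultdict(list)
    for flowfile in flowfiles:
        bins[flowfile["attributes"][attribute]].append(flowfile)
    return dict(bins)

def merge(bin_):
    """Concatenate the content of every FlowFile in a bin, newline-separated."""
    return b"\n".join(flowfile["content"] for flowfile in bin_)

flowfiles = [
    {"attributes": {"dataset.id": "A"}, "content": b"a1"},
    {"attributes": {"dataset.id": "B"}, "content": b"b1"},
    {"attributes": {"dataset.id": "A"}, "content": b"a2"},
]
bins = bin_by_attribute(flowfiles)
merged_a = merge(bins["A"])
```

Only FlowFiles that carry the same 'dataset.id' value end up in the same bin, which is what ties the two datasets together.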
lls, with small files as well as larger files (10s of MB vs GB sized
files).
Thanks again.
patw
On Mon, Feb 4, 2019 at 5:18 PM Mark Payne
mailto:marka...@hotmail.com>> wrote:
Hey Pat,
I saw this thread but have not yet had a chance to look into it. So thanks for
following up!
uestion on top of this, I can ask it as a
> separate thread if it makes more sense.
>
> Thanks
>
>
>> On Feb 13, 2019, at 12:50 PM, Mark Payne wrote:
>>
>> Vijay,
>>
>> This would be treated as arguments to a single command.
>>
>> One option
their default values I could be under-utilizing the resources (storage, mem,
CPU, etc.) I have on my servers dedicated to NiFi?
-Chad
On Wed, Feb 13, 2019 at 1:48 PM Mark Payne
mailto:marka...@hotmail.com>> wrote:
Hey Chad,
What do you have for the value of the
"nifi.provenance.re
Vijay,
This would be treated as arguments to a single command.
One option would be to create a simple bash script that executes the desired
commands and invoke
that from the processor. Or, of course, you can chain together multiple
processors.
Thanks
-Mark
> On Feb 13, 2019, at 1:48 PM,
Hey Chad,
What do you have for the value of the
"nifi.provenance.repository.max.storage.size" property?
We will often see this if the value is very small (the default is 1 GB, which
is very small) and the volume
of data is reasonably high.
The way that the repo works, it writes to one file for
:46 AM, Mike Thomsen
mailto:mikerthom...@gmail.com>> wrote:
Schema access strategy is inherit record schema and the version is 1.8.
Thanks,
Mike
On Wed, Feb 13, 2019 at 10:37 AM Mark Payne
mailto:marka...@hotmail.com>> wrote:
Mike,
That should be fine. The NullPointerException seems
e any code
> that needs to be added to the NAR so the processor’s metrics are reported in
> this category or does NiFi handle that automatically?
>
> -Chad
>
>
>> On Feb 13, 2019, at 9:44 AM, Mark Payne wrote:
>>
>> Chad,
>>
>> It represent
Chad,
It represents any FlowFile that was received from an external source. This
could be via
Site-to-Site or could be from something like GetHTTP, FetchSFTP, etc. It
correlates
to any RECEIVE or FETCH provenance events.
If you go to the Summary table (menu in the top right / Summary) and then
Matt,
That would work if you want to select distinct records in a given FlowFile but
not across FlowFiles.
PartitionRecord -> UpdateAttribute (optionally to combine multiple attributes
into one) -> DetectDuplicate
would work, but given that you expect the records to be unique generally, this
We do not. I've thought about it, but I have not had a chance to put any work
towards it. My vision of how it would work would be to
allow the user to specify N RecordPath values as user-defined properties.
Then have those values extracted out and another
Record would be considered a
eally frustrating and I don't want to go back to RPG
if possible, although that has been rock solid. Will keep an eye out for 1.9!
Regards
Dano
On Thu, Jan 17, 2019, 5:17 PM Mark Payne
mailto:marka...@hotmail.com> wrote:
Hey Dan,
This can happen even within a process group, it is
Hey Pat,
I saw this thread but have not yet had a chance to look into it. So thanks for
following up!
The embedded server is handled in the JettyServer class [1]. I can imagine that
it may automatically turn on
GZIP. When pushing data, though, the client would be the one supplying the
stream
value of the REST_URL variable, then use that as the name of a
variable."
Doesn't seem to make much sense: the plan is to run "endsWith" on value of
REST_URL, doesn't your expression set to run it on literal string "REST_URL"
within {...} ?
Thank you,
On Fri, Feb 1, 2019 at 8:58 AM
Hello,
The issue appears to be that you're using `${${REST_URL}:endsWith`
at the beginning, which is saying "resolve the value of the REST_URL variable,
then use that as the name of a variable."
So it's looking for a variable named "http://localhost:8080/nifi-api" and never
finds it. So
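To make the double lookup concrete, here is a small Python analogy (not NiFi code). The variable registry is modelled as a dictionary, and the "/nifi-api" suffix passed to `endswith` is a hypothetical argument, since the original expression is not shown in full.

```python
# A stand-in for the variable registry.
variables = {"REST_URL": "http://localhost:8080/nifi-api"}

def resolve(name):
    """Look up a variable by name; None when no such variable exists."""
    return variables.get(name)

# Nested form: resolve REST_URL first, then use its VALUE as a variable name.
inner = resolve("REST_URL")
nested = resolve(inner)

# Single form: resolve REST_URL once and call the function on its value.
# The suffix below is a hypothetical example.
single = resolve("REST_URL").endswith("/nifi-api")
```

Here `nested` is `None` because no variable is named after the URL, while `single` operates on the URL itself.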
Aurelien,
What timing - I just submitted a Pull Request [1] for this issue [2] yesterday!
If you're the type who is inclined to build the branch and test it out, please
do so.
Otherwise, I suspect it will make it into the next release.
Thanks
-Mark
[1] https://github.com/apache/nifi/pull/3279
It looks like you are attempting to send a header named "Proxy Type". I don't
believe that HTTP Headers
are allowed to have spaces. So you'll want to check your configuration of the
InvokeHTTP processor and see
why it's trying to send that header.
On Jan 30, 2019, at 12:01 PM, l vic
Boris,
On the Settings tab, have you changed the value of the "Yield Duration"? The
default, I believe, is 1 second.
I would recommend that you change that to "0 sec" and that may do the trick.
Thanks
-Mark
On Jan 24, 2019, at 3:30 PM, Boris Tyukin
mailto:bo...@boristyukin.com>> wrote:
any
worth considering a release soon.
Sent from my iPhone
On Jan 17, 2019, at 6:59 PM, dan young
mailto:danoyo...@gmail.com>> wrote:
Hello Mark,
We're seeing "stuck" flow files again, this time within a PG...see attached
screen shots :(
On Fri, Dec 28, 2018 at 8:43 AM Mark Pay
king good so far and will look
forward to 1.9.
Regards,
Dano
On Fri, Dec 28, 2018 at 8:43 AM Mark Payne
mailto:marka...@hotmail.com>> wrote:
Dan, et al,
Great news! I was able to replicate this issue finally, by creating a
Load-Balanced connection
between two Process Groups/Ports instead of
sors, could be as well that one of
them is causing it? Can someone share a (Java) code snippet which ensures that a
custom processor doesn’t keep the flowfiles in the content repo?
Cheers Josef
From: Mark Payne mailto:marka...@hotmail.com>>
Reply-To: "users@nifi.apache.org<mailto:u
Josef,
Thanks for the info! There are a few things to consider here. Firstly, you said
that you are using NiFi 1.8.0.
Are you using the new Load Balancing capability? I.e., do you have any
Connections configured to balance
load across your cluster? And if so, are you load-balancing any 0-byte
t 6:13 PM Mark Payne
mailto:marka...@hotmail.com>> wrote:
Ok, I just wanted to confirm that when you said “once it rejoins the cluster
that flow file is gone” that you mean “the flowfile did not exist on the
system” and NOT “the queue size was 0 by the time that I looked at the UI.”
I.e., is
, 2018 at 6:13 PM Mark Payne
mailto:marka...@hotmail.com>> wrote:
Ok, I just wanted to confirm that when you said “once it rejoins the cluster
that flow file is gone” that you mean “the flowfile did not exist on the
system” and NOT “the queue size was 0 by the time that I looked at t
s disconnected I restart nifi, and then once it rejoins the cluster that
flow file is gone. If we try to empty the queue, it will just say that there are no
flow files in the queue.
On Wed, Dec 26, 2018, 5:22 PM Mark Payne
mailto:marka...@hotmail.com> wrote:
Hey Dan,
Thanks, this is super useful
nifi 1.8
Heya Mark,
So I added a Log Attribute Processor and routed the connection that had the
"stuck" flowfile to it. I ran a get diagnostics to the Log Attribute
processor before I started it, and then ran another diagnostics after I started
it. The flowfile stayed in
r id should I be trying to
gather the diagnostics on? The queue is in between two processor groups.
Maybe the issue with the Unknown User has to do with some policy I don't have
set correctly?
Happy Holidays!
Regards,
Dano
On Wed, Dec 19, 2018 at 6:51 AM Mark Payne
mailto:marka...@ho
we want to grab it when we see the
stuck Flowfile in a queue, correct?
Also, does the nifi thread dump provide anything? This was from the node that
seemed to have the stuck Flowfile...
Dano
On Wed, Dec 19, 2018, 6:51 AM Mark Payne
mailto:marka...@hotmail.com> wrote:
Hey Josef, Dano,
Firstly
/nifi-docs/html/user-guide.html#Summary_Page
On Dec 19, 2018, at 2:18 AM,
mailto:josef.zahn...@swisscom.com>>
mailto:josef.zahn...@swisscom.com>> wrote:
Hi Dano
Seems that the problem has been seen by a few people but until now nobody from
NiFi team really cared about it – except Mark Pay
Robert,
Thanks for reporting this - and for all the details! The good news is that it
is easy to replicate.
The better news is that it was pretty easy to fix :) I have created NIFI-5879
[1] to track this issue,
and there is a Pull Request up for it now.
Thanks!
-Mark
[1]
Hi Kien,
Thanks for the details! Can you tell us a bit more about how the Connection is
configured?
What is the Load Balance Strategy that you're using? Do you have Back-Pressure
enabled?
Is it configured for the default 10,000 FlowFiles / 1 GB, or have you changed
those settings?
Are you
19:00:00
On Wed, Nov 14, 2018 at 9:21 AM Mark Payne
mailto:marka...@hotmail.com>> wrote:
You should be able to do something like:
${now():divide( 86400000 ):multiply( 86400000 )}
I.e., use integer division to divide by number of milliseconds in a day, which
gives you
the number of day
From: Mark Payne mailto:marka...@hotmail.com>>
Sent: Friday, November 16, 2018 2:14 PM
To: users@nifi.apache.org<mailto:users@nifi.apache.org>
Subject: Re: Problem Debugging InvokeHTTP Processor in Nifi
Hi Jim,
Can you build a template of your flow and share that? If so, that's usually the
easiest way to try to
replicate the behavior and to understand exactly how your flow is configured.
Thanks
-Mark
On Nov 16, 2018, at 2:58 PM, Williams, Jim
mailto:jwilli...@alertlogic.com>> wrote:
Hello,
just 1 or 2 files). The
log (with default log levels) show no WARN or ERRORs…
Thanks in advance, Josef
From: Mark Payne mailto:marka...@hotmail.com>>
Reply-To: "users@nifi.apache.org<mailto:users@nifi.apache.org>"
mailto:users@nifi.apache.org>>
Date: Mond
You should be able to do something like:
${now():divide( 86400000 ):multiply( 86400000 )}
I.e., use integer division to divide by number of milliseconds in a day, which
gives you
the number of days since epoch. Then multiply by 86,400,000 again to convert
from
days back to milliseconds. While
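The same arithmetic can be checked outside of NiFi. This is a minimal Python sketch of the integer-division technique described above, using 86,400,000 milliseconds per day; the function name is made up for illustration.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000  # 86,400,000 milliseconds

def truncate_to_day(epoch_ms):
    """Drop the time-of-day portion of an epoch-millisecond timestamp."""
    days_since_epoch = epoch_ms // MS_PER_DAY  # integer division
    return days_since_epoch * MS_PER_DAY       # back to milliseconds

start_of_day = truncate_to_day(1542204060000)
```

Because epoch time is counted from midnight UTC, the result is aligned to midnight UTC, not to midnight in the local time zone.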
Hi Mandeep,
I think this may actually be a bug in QueryRecord, in the way that it's
handling the Expression Language.
Do please file a JIRA for that. In the meantime, you can probably work around
the issue using the replaceNull() method.
So if your expression was
Mike,
Is there new data coming into the Kafka topic? By default, when the Processor
is started, it uses
an auto commit offset of 'latest'. So that means that if you started the
Processor with this setting,
the commit offset is saved pointing to the end of the topic. So if no more data
is
Philippe,
Thanks for reporting this. I have been able to replicate the issue. I created
[1] to address the issue and just
posted a Pull Request.
Thanks
-Mark
[1] https://issues.apache.org/jira/browse/NIFI-5814
On Nov 12, 2018, at 5:36 AM, PEETERS Philippe
mailto:philippe.peet...@etnic.be>>
Hey Dan,
Have you looked through the logs to see if there are any WARNs or ERRORs indicating
what's going on?
Thanks
-Mark
On Nov 12, 2018, at 9:06 AM, dan young
mailto:danoyo...@gmail.com>> wrote:
Hello,
We have two processor groups connected via the new Load Balancing/Round Robin
queue.
Mandeep,
Thanks for reporting these! I put up a JIRA [1] that covers the cases #2 and
#4, and a Pull Request to address it.
I was not able to replicate #1. The result that I got back was [{"world":
"hello"}] - which is what I expect. Perhaps you
can share a template that shows your
Hi Vijay,
You may run into some problems if you try to build a single NiFi cluster that
spans multiple
data centers. NiFi depends on ZooKeeper to perform leader election to determine
which node
is the Cluster Coordinator and which is the Primary Node. ZooKeeper guidance
indicates that you
it'd help answering some
> use cases.
> We just need to be very clear about the processor behavior for each possible
> case/configuration.
>
> Pierre
>
> Le mar. 25 sept. 2018 à 16:19, Mark Payne
> mailto:marka...@hotmail.com>> a écrit :
>>
>>
Matt,
I think it's very dangerous to manipulate the behavior of the processor so
drastically based
on the presence or absence of an incoming connection. I think it is fair game,
however, to allow
for a new property to be added that indicates whether or not state is
maintained. Then, the
Hey Matthew,
Sorry about the confusion. I think the problem is around the "Tailing mode"
property. In your case,
you want to use "Single file" because your desire is to tail a single "rolling"
file. You'd want to use
"Multiple files" if you were tailing multiple independent files.
So you'll
Shawn,
I believe the issue that you're running into is that you're defining the
'flow_file' field of an Array of type Byte.
Which means that it is expecting as its value an object of type Byte[] but you
are passing it an object of type byte[].
You'd have to create a Byte[] instead, using the
e information within
the authorizers file, however since we use LDAP, I believe that isn't
necessary? Or is it similar to the set up of nifi-to-nifi clustering?
Cheers,
Nikhil C.
On Thu, Sep 13, 2018 at 8:40 PM Mark Payne
mailto:marka...@hotmail.com>> wrote:
Hi Nikhil,
The property that
Hi Nikhil,
The property that you mention: "nifi.registry.security.needClientAuth" applies
only to user logins.
This allows users to login via certificate or other methods by not requiring
that they present a client
certificate. However, NiFi & registry require mutual authentication for all
Christoffer,
FlowFile expiration isn't really something that's available in the stats
currently.
There is an EXPIRE Provenance Event that is emitted whenever a FlowFile
gets expired, so you could make use of the SiteToSiteProvenanceReportingTask,
if you wanted to, in order to glean this
Nikhil,
Are you running a NiFi cluster or a single/standalone node?
If you're running standalone, state is stored locally on disk. By default, it
is stored in ./state/local
and so you'll want to ensure that you copy over the state/ directory from your
previous install to
the new install or
Hey Shawn,
It sounds like you need to set the CSV reader’s “Treat First Line as Header”
property to true. By default it treats the first line as the first record (as
opposed to the header), which looks like the case here.
Sent from my iPhone
On Aug 18, 2018, at 1:30 PM, Shawn Weeks
IFI-5482
On Aug 17, 2018, at 10:26 AM, Daniel Watson
mailto:dcwatso...@gmail.com>> wrote:
Yes.
On Fri, Aug 17, 2018 at 10:19 AM Mark Payne
mailto:marka...@hotmail.com>> wrote:
OK, thanks. Are you using the default implementation of the Provenance
Repository? I.e., the PersistentP
Aug 17, 2018 at 10:09 AM Mark Payne
mailto:marka...@hotmail.com>> wrote:
Hi Daniel,
What version of NiFi are you running?
Thanks
-Mark
> On Aug 17, 2018, at 10:07 AM, Daniel Watson
> mailto:dcwatso...@gmail.com>> wrote:
>
> Anyone have any issues with the data lineage
Hey Bob,
The InferAvroSchema processor actually works against JSON and CSV data. It is
designed to infer an Avro Schema
so that you can convert either of those into Avro if you want. So you can send
it JSON data and it will infer the schema for
you and put it in the "inferred.avro.schema"
Hi Daniel,
What version of NiFi are you running?
Thanks
-Mark
> On Aug 17, 2018, at 10:07 AM, Daniel Watson wrote:
>
> Anyone have any issues with the data lineage screen? My NiFi instance can't
> compute the data lineage for a specific flow. It worked originally, then
> after running
Jim,
I'd recommend RouteText. ScanContent would also be an alternative.
Thanks
-Mark
> On Aug 15, 2018, at 2:02 PM, James McMahon wrote:
>
> Good afternoon. I have a requirement to search for and detect a pattern
> "request":"false" is anywhere in the content of a flowfile. The content is
Joe G,
Also, to clarify, when you say "when we add receiving Site-to-Site traffic to
the mix, the CPU spikes to the point that the nodes can't talk to each other,
resulting in the inability to view or modify the flow in the console"
what exactly does "when we add receiving Site-to-Site traffic
Boris,
Using a Record-based processor does not mean that you need to define a schema
upfront. This is
necessary if the source itself cannot provide a schema. However, since it is
pulling structured data
and the schema can be inferred from the database, you wouldn't need to. As Matt
was saying,
hink I will mull this over for a bit
-Ryan
On Thu, Jul 26, 2018 at 9:57 AM, Mark Payne
mailto:marka...@hotmail.com>> wrote:
Hey Ryan,
The stats that you are seeing here is a rolling 5-minute window. The
"bytesReceived" indicates the number of bytes that were received from ex
KT,
I can confirm that this is the behavior I'm seeing as well. I went ahead and
created a JIRA [1]
for this. I think the bug really is in the fact that we allow you to start the
Port at all. Just like some
Processors are annotated as Requiring Input in order to be valid, ports should
be too
Hey Ryan,
The stats that you are seeing here is a rolling 5-minute window. The
"bytesReceived" indicates the number of bytes that were received from external
systems (i.e., the number of bytes reported as Provenance RECEIVE events). The
"bytesSent" indicates the number of bytes that were sent
Travis,
If you Configure the FetchFile processor and go to the Scheduling tab, is it
Timer-Driven with a Run Schedule of "0 secs"?
Can you try going in your browser to /nifi-api/processors//diagnostics and send the JSON that it returns to the
mailing list?
I.e., if you go to
Hi Vyshali,
Have you looked into the TailFile processor? I believe that will do what you
are asking for.
Thanks
-Mark
Sent from my iPhone
On Jul 18, 2018, at 8:30 AM, N, Vyshali
mailto:vyshal...@honeywell.com>> wrote:
Hi,
I have a CSV file and the data gets appended to that then and there.
Jean-Sébastien,
It will create a new process for each FlowFile in the queue.
Thanks
-Mark
On Jul 13, 2018, at 8:55 AM, Jean-Sebastien Vachon
mailto:jsvac...@brizodata.com>> wrote:
Hi
Let's say I have an external process that I am running using an
ExecuteStreamCommand, will Nifi keep the
Kiran,
What do you have set for the "Maximum number of Bins" property of MergeContent?
Each 'zip bundle' will have all of the FlowFiles added to the same bucket.
So if you have more 'zip bundles' coming in than you have available buckets,
it will evict one of the bins before all of its FlowFiles
a:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at
com.rabbitmq.jms.client.RMQMessage.instantiateRmqMessage(RMQMessage.java:1064)
Shawn,
There are certainly a few different ways to handle this. Unfortunately, it may
not be as simple and straight-forward
as you might expect, because you're trying to replace text with some
non-readable characters. Jagrut's method should
certainly work. However, a simpler solution, I think,
Faisal,
How much heap do you have allocated to your NiFi instance? In
conf/bootstrap.conf
the default value is 512 MB. If you haven't changed that, you could be just
running out of
heap.
Also, have you changed the maximum number of threads available to your NiFi
instance?
In the top-right
Prashanth,
Also of note, are you actually updating any fields in the CSV that you receive
with UpdateRecord / your custom processor?
Or are you just using that to convert the CSV to Avro? If the latter, you can
actually just remove this processor from your flow
entirely and simply use
ply. Please find the comments inline.
Thanks & Regards,
Prashanth
From: Mark Payne [mailto:marka...@hotmail.com]
Sent: Wednesday, June 13, 2018 6:07 PM
To: users@nifi.apache.org<mailto:users@nifi.apache.org>
Subject: Re: NiFi Performance Analysis Clarification
Prashanth,
Whenever th
Martijn,
"As an aside, does DistributeLoad use backpressure to know what processor is /
is not available?"
- It depends on the value that you set for the Processor's "Distribution
Strategy." The default is
Round Robin, which means that if any of the connections applies Back Pressure,
then
Martijn,
Typically when I come across a set of processors like this, I go with an
approach like https://imgur.com/a/3Zh3FeN
So we have a DistributeLoad going to one of 24 different PutS3Object
processors. Each processor's 'failure'
relationship is then routed to a funnel, and that funnel just
Prashanth,
Whenever the FlowFile Repository performs a Checkpoint, it has to ensure that
it has flushed all data to disk
before continuing, so it performs an fsync() call so that any data buffered by
the Operating System is flushed
to disk as well. If you're using the same physical drive /
Tim,
Typically when I see that issue, it's due to OutOfMemory Errors or constant
garbage collection. How large is your heap?
FWIW, an alternative is to use "kill -3 " The -3 will cause java to
perform a thread dump. So you can do "cat run/nifi.pid | xargs kill -3"
On May 23, 2018, at 11:13
Guido,
Dynamically deploying extensions is very much on the roadmap. There is actually
a Feature Proposal [1]
written that outlines some of the ideas about how it would work. The idea being
that extensions would be
published to an Extension Registry, and NiFi would then interact with that
t.j...@open-insights.co.in<mailto:mohit.j...@open-insights.co.in>> wrote:
Hi Mark,
I set Strict Type Checking to false, but it still isn’t allowed.
Thanks,
Mohit
From: Mark Payne <marka...@hotmail.com<mailto:marka...@hotmail.com>>
Sent: 02 May 2018 23:00
To: users@nifi
Aurélien,
In that case you're looking to merge about 500,000 FlowFiles into a single
FlowFile, so you'll
definitely want to use a cascading approach. I'd shoot for about 1 MB for the
first MergeRecord
and then merge 128 of those together for the second MergeRecord.
The provenance backpressure
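For a rough feel of the numbers in a cascading merge, here is a back-of-the-envelope Python sketch. The 500,000 FlowFiles, the roughly 1 MB first-stage target, and the fan-in of 128 come from the message above; the average FlowFile size of 256 bytes is purely an assumption for illustration, as are the function names.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)

def cascade_plan(total_flowfiles, avg_size_bytes,
                 first_stage_target_bytes=1024 * 1024,
                 second_stage_fan_in=128):
    """Estimate bundle counts for a two-stage merge.

    Returns (FlowFiles per first-stage bundle,
             number of first-stage bundles,
             number of second-stage outputs).
    """
    per_bundle = max(1, first_stage_target_bytes // avg_size_bytes)
    first_stage = ceil_div(total_flowfiles, per_bundle)
    second_stage = ceil_div(first_stage, second_stage_fan_in)
    return per_bundle, first_stage, second_stage

# avg_size_bytes=256 is an assumed value, not taken from the thread.
plan = cascade_plan(500_000, 256)
```

Under that assumption, neither merge step has to hold more than a few thousand FlowFiles in a single bin, which is the point of cascading.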
","int"],"default":null},{"name":"age","type":["null","string"],"default":null}]}
It doesn’t pass the record to the invalid relationship. Instead, it keeps the file in
the queue before the ValidateRecord processor.
Mohit
From
t is passing to the invalid
relationship. Instead it keeps on throwing bulletins keeping the flowfile in
the queue.
Any suggestion?
Mohit
From: Mark Payne <marka...@hotmail.com<mailto:marka...@hotmail.com>>
Sent: 02 April 2018 19:02
To: users@nifi.apache.org<mailto:users@n
convert 6-7k per second, which,
though not optimal, is much better than 45-50 records per second.
Is there any other improvement I can make?
Mohit
From: Mark Payne <marka...@hotmail.com<mailto:marka...@hotmail.com>>
Sent: 02 April 2018 18:30
To: users@nifi.apache.org<mailto:users@n
Mohit,
I agree that 45-50 records per second is quite slow. I'm not very familiar with
the implementation of
ConvertCSVToAvro, but it may well be that it must perform some sort of
initialization for each FlowFile
that it receives, which would explain why it's fast for a single incoming
e connection.
What should the behaviour be when the connection is shared?
Regards,
Leandro Lourenco
On Mon, Mar 26, 2018 at 8:49 PM, Mark Payne
<marka...@hotmail.com<mailto:marka...@hotmail.com>> wrote:
Leandro,
As far as I can tell, the processor is in fact behaving as you
Andre,
I knew this was possible but had no idea how. Thanks for the great explanation
and associated caveats!
-Mark
On Mar 24, 2018, at 1:04 AM, Andre
> wrote:
Ravi,
There are two ways of solving this.
One of them (suggested to me MapR
Hey Boris,
Using the UpdateAttribute and RouteOnAttribute approach is only necessary when
you want
to retry N number of times (or for some time period) and after that elapses to
treat the data
differently. Most of the time, though, what is used is to simply loop the
'failure' relationship back
ction and I cannot change it for now.
On Tue, 27 Feb 2018 at 11:17 Mike Thomsen
<mikerthom...@gmail.com<mailto:mikerthom...@gmail.com>> wrote:
That doesn't look like the right way to specify an empty array. This SO example
fits about what I'd expect:
https://stackoverflow.com/a/421
Juan,
So the scenario that you laid out in the NIFI-4893 is not one that I've
personally
encountered. What does it mean exactly to have an Avro schema with an "array"
type
that has a value? In the example that you laid out, the field has:
"type": {"type": "array", "items": "int" }, "default":
rg>
Date: 02/16/2018 06:00 PM
Subject: Re: [Data Flow] File content not read completely
Hi Mark,
Thanks for looking into this. I am trying to put in the components you have
suggested. I'll update.
Regards,
Valencia
Mark Payne ---02/15/2018 07:09:32 PM---Valenc