Re: Filter flowfiles by size from RouteOnAttribute

2019-05-17 Thread Mark Payne
Mike, You can reference ${fileSize}. This is the size of a FlowFile in bytes. Thanks -Mark On May 17, 2019, at 1:27 PM, Mike Thomsen <mikerthom...@gmail.com> wrote: Is there a way to get access to the flowfile size from EL? This said to use Content-Length, but that didn't appear to

Re: use attributed defined in the same UpdateAttribute processor

2019-05-16 Thread Mark Payne
operty descriptors that are scoped for attributes) Andy LoPresto alopre...@apache.org<mailto:alopre...@apache.org> alopresto.apa...@gmail.com<mailto:alopresto.apa...@gmail.com> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69 On May 16, 2019, at 18:21, Mark Payne mailto:ma

Re: use attributed defined in the same UpdateAttribute processor

2019-05-16 Thread Mark Payne
We could probably handle this without the user interface changing. The processor could be smart enough to determine the proper ordering to evaluate the properties based on whether or not it references another property. For example if properties are: first.name = John last.name = Doe greeting =
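The ordering idea above can be sketched in Python as a dependency sort. This is only an illustration, not how UpdateAttribute is implemented: the `full.name` property and the simplified `${name}` matching are assumptions made for the example, and `graphlib` requires Python 3.9 or later.

```python
import re
from graphlib import TopologicalSorter

# Matches a simple ${name} reference; real Expression Language is richer.
REFERENCE = re.compile(r"\$\{([^}:]+)\}")


def evaluation_order(properties):
    """Return property names ordered so that referenced properties come first."""
    graph = {
        name: {ref for ref in REFERENCE.findall(value) if ref in properties}
        for name, value in properties.items()
    }
    return list(TopologicalSorter(graph).static_order())


def evaluate(properties):
    """Resolve every property, substituting values that were resolved earlier."""
    resolved = {}
    for name in evaluation_order(properties):
        resolved[name] = REFERENCE.sub(
            lambda match: resolved.get(match.group(1), match.group(0)),
            properties[name],
        )
    return resolved


properties = {
    # Hypothetical property that references the two defined after it.
    "full.name": "${first.name} ${last.name}",
    "first.name": "John",
    "last.name": "Doe",
}
result = evaluate(properties)
```

With this approach the order in which the user lists the properties does not matter. A circular reference would make `TopologicalSorter` raise `graphlib.CycleError`, which a processor would presumably need to report as invalid configuration.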

Re: Using result of HTTP request to filter flowfiles

2019-04-16 Thread Mark Payne
Wolfgang, So the general pattern that you're running into here is that you want to ingest data from Source 1 (a database), then filter/route the data based on a lookup in a second system/dataset (the API). So, for this I would recommend that you look at the LookupRecord processor. So the flow,

Re: Merge identical JSON records to single JSON with subarray

2019-04-15 Thread Mark Payne
John, It sounds like you've got this working, which is great, so thanks for following up and sharing the transformation that you used. I'm curious, though, if you looked into using PartitionRecord instead of the Split -> JsonPath -> Merge path. PartitionRecord should make that simpler and will

Re: Merge? Notify? Wait?

2019-03-27 Thread Mark Payne
William, You should be able to perform this using the Wait/Notify pattern. But I think a simpler (and likely better performing) alternative would be to avoid splitting out into two directions at all and instead use a linear flow: ListFile -> AttributesToJson -> ReplaceText -> InvokeHttp ->

Re: Changing All Stopped Processors to Disabled

2019-03-27 Thread Mark Payne
Ryan, In later versions of NiFi (1.8.0 I believe) the validation is done in the background instead of in the web request, so stopped processors won't slow down the UI. We also do allow the user to select 1 or more components and disable/enable the components in bulk. Older versions did not

Re: Most efficient means to search for a character in flowFiles

2019-03-13 Thread Mark Payne
RouteOnContent may be a good solution. ScanContent is probably a more efficient recommendation, though. ReplaceText would probably also work well, but you don't want to use Evaluation Mode of Entire Text - you're buffering the content of the entire FlowFile into memory and running a regex over

Re: Nifi 1.9 FlowFile Repository failed to update , java.nio.file.FileAlreadyExistsException

2019-03-08 Thread Mark Payne
Thanks for reporting this, Viking! I have created a JIRA [1] to address this. I expect to have a PR up for it shortly. Thanks -Mark [1] https://issues.apache.org/jira/browse/NIFI-6110 On Mar 8, 2019, at 4:37 AM, Viking K <cyber_v...@hotmail.com> wrote: Hi, I recently upgraded our

Re: Different NiFi Node sizes within same cluster

2019-03-06 Thread Mark Payne
Chad, This should not be a problem, given that all nodes have enough storage available to handle the influx of data. Thanks -Mark > On Mar 6, 2019, at 1:44 PM, Chad Woodhead wrote: > > Are there any negative effects of having filesystem mounts (dedicated mounts > for each repo) used by the

Re: TestRunner fileSize Attribute

2019-02-26 Thread Mark Payne
On Feb 26, 2019, at 10:28 AM, Mark Payne <marka...@hotmail.com> wrote: Hi Shawn, filename and uuid are attributes of the FlowFile. There's no fileSize attribute (unless added explicitly by a processor). You can get the size of a FlowFile by calling FlowFile.getSize(). Does that h

Re: TestRunner fileSize Attribute

2019-02-26 Thread Mark Payne
Hi Shawn, filename and uuid are attributes of the FlowFile. There's no fileSize attribute (unless added explicitly by a processor). You can get the size of a FlowFile by calling FlowFile.getSize(). Does that help? Thanks -Mark > On Feb 26, 2019, at 11:20 AM, Shawn Weeks wrote: > > Since

Re: join two datasets

2019-02-22 Thread Mark Payne
This is certainly a better route to go than my previous suggestion :) Have one flow that grabs one of the datasets and stores it somewhere. In a CSV or XML file, even. Then, have a second flow that pulls the other dataset and uses LookupRecord to perform the enrichment. The CSVLookupService and

Re: join two datasets

2019-02-22 Thread Mark Payne
Boris, I would echo the cautions from Bryan & Joe. However, you could conceivably achieve this by extracting out some id into an attribute that would associate the two FlowFiles together (for example 'dataset.id'). Use MergeRecord or MergeContent to merge the data together using that as a

Re: Asymmetric push/pull throughput with S2S, possibly related to openConnectionForReceive compression?

2019-02-14 Thread Mark Payne
lls, with small files as well as larger files (10s of MB vs GB sized files). Thanks again. patw On Mon, Feb 4, 2019 at 5:18 PM Mark Payne <marka...@hotmail.com> wrote: Hey Pat, I saw this thread but have not yet had a chance to look into it. So thanks for following up!

Re: running multiple commands from a single ExecuteStreamCommand processor

2019-02-13 Thread Mark Payne
uestion on top of this, I can ask it as a > separate thread if it makes more sense. > > Thanks > > >> On Feb 13, 2019, at 12:50 PM, Mark Payne wrote: >> >> Vijay, >> >> This would be treated as arguments to a single command. >> >> One option

Re: Failed to read TOC File

2019-02-13 Thread Mark Payne
their default values I could be under-utilizing the resources (storage, mem, CPU, etc.) I have on my servers dedicated to NiFi? -Chad On Wed, Feb 13, 2019 at 1:48 PM Mark Payne <marka...@hotmail.com> wrote: Hey Chad, What do you have for the value of the "nifi.provenance.re

Re: running multiple commands from a single ExecuteStreamCommand processor

2019-02-13 Thread Mark Payne
Vijay, This would be treated as arguments to a single command. One option would be to create a simple bash script that executes the desired commands and invoke that from the processor. Or, of course, you can chain together multiple processors. Thanks -Mark > On Feb 13, 2019, at 1:48 PM,
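As a rough analogy for "arguments to a single command", the Python sketch below launches one process without a shell and shows that separators such as `;` arrive as plain arguments instead of starting a second command. It illustrates the general behavior of shell-less process launching, not ExecuteStreamCommand's implementation; the stand-in command and the argument values are assumptions for the example.

```python
import subprocess
import sys

# A tiny stand-in "command" that prints the arguments it received.
SHOW_ARGS = "import sys; print(sys.argv[1:])"


def run_with_arguments(arguments):
    """Launch one process without a shell and return what it printed."""
    completed = subprocess.run(
        [sys.executable, "-c", SHOW_ARGS, *arguments],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


# ";" and the would-be second command are passed through as arguments.
output = run_with_arguments(["first", ";", "second", "--flag"])
```

This is why wrapping the desired commands in a script, and invoking that script as the single command, is the simpler route.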

Re: Failed to read TOC File

2019-02-13 Thread Mark Payne
Hey Chad, What do you have for the value of the "nifi.provenance.repository.max.storage.size" property? We will often see this if the value is very small (the default is 1 GB, which is very small) and the volume of data is reasonably high. The way that the repo works, it writes to one file for

Re: QueryRecord fails on nullable records

2019-02-13 Thread Mark Payne
:46 AM, Mike Thomsen <mikerthom...@gmail.com> wrote: Schema access strategy is inherit record schema and the version is 1.8. Thanks, Mike On Wed, Feb 13, 2019 at 10:37 AM Mark Payne <marka...@hotmail.com> wrote: Mike, That should be fine. The NullPointerException seems

Re: AmbariReportingTask help

2019-02-13 Thread Mark Payne
e any code > that needs to be added to the NAR so the processor’s metrics are reported in > this category or does NiFi handle that automatically? > > -Chad > > >> On Feb 13, 2019, at 9:44 AM, Mark Payne wrote: >> >> Chad, >> >> It represent

Re: AmbariReportingTask help

2019-02-13 Thread Mark Payne
Chad, It represents any FlowFile that was received from an external source. This could be via Site-to-Site or could be from something like GetHTTP, FetchSFTP, etc. It correlates to any RECEIVE or FETCH provenance events. If you go to the Summary table (menu in the top right / Summary) and then

Re: Record-oriented DetectDuplicate?

2019-02-08 Thread Mark Payne
Matt, That would work if you want to select distinct records in a given FlowFile but not across FlowFiles. PartitionRecord -> UpdateAttribute (optionally to combine multiple attributes into one) -> DetectDuplicate would work, but given that you expect the records to be unique generally, this

Re: Record-oriented DetectDuplicate?

2019-02-08 Thread Mark Payne
We do not. I've thought about it, but I have not had a chance to put any work towards it. My vision of how it would work would be to allow user to specify N number of RecordPath values as user-defined properties. Then have those values extracted out and another Record would be considered a

Re: flowfiles stuck in load balanced queue; nifi 1.8

2019-02-08 Thread Mark Payne
eally frustrating and I don't want to go back to RPG if possible, although that has been rock solid. Will keep an eye out for 1.9! Regards Dano On Thu, Jan 17, 2019, 5:17 PM Mark Payne <marka...@hotmail.com> wrote: Hey Dan, This can happen even within a process group, it is

Re: Asymmetric push/pull throughput with S2S, possibly related to openConnectionForReceive compression?

2019-02-04 Thread Mark Payne
Hey Pat, I saw this thread but have not yet had a chance to look into it. So thanks for following up! The embedded server is handled in the JettyServer class [1]. I can imagine that it may automatically turn on GZIP. When pushing data, though, the client would be the one supplying the stream

Re: expression failure in URL concatenation?

2019-02-01 Thread Mark Payne
value of the REST_URL variable, then use that as the name of a variable." Doesn't seem to make much sense: the plan is to run "endsWith" on value of REST_URL, doesn't your expression set to run it on literal string "REST_URL" within {...}? Thank you, On Fri, Feb 1, 2019 at 8:58 AM

Re: expression failure in URL concatenation?

2019-02-01 Thread Mark Payne
Hello, The issue appears to be that you're using `${${REST_URL}:endsWith` at the beginning, which is saying "resolve the value of the REST_URL variable, then use that as the name of a variable." So it's looking for a variable named "http://localhost:8080/nifi-api" and never finds it. So
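The double lookup described above can be modeled with a toy Python dictionary. This is a deliberately simplified stand-in for Expression Language evaluation, assuming a single variable registry and that unknown names resolve to nothing; the `resolve` helper is hypothetical.

```python
variables = {"REST_URL": "http://localhost:8080/nifi-api"}


def resolve(name):
    """Look up a variable by name; unknown names resolve to None."""
    return variables.get(name)


# ${REST_URL}: one lookup, which yields the URL.
single = resolve("REST_URL")

# ${${REST_URL}}: the inner lookup yields the URL, and the outer lookup
# then treats that URL as a variable *name*, which does not exist.
nested = resolve(resolve("REST_URL"))
```

Here `single` holds the URL, so a function such as `endsWith` has a value to work on, while `nested` is empty because no variable is named after the URL.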

Re: PutKudu & Kerberos

2019-01-31 Thread Mark Payne
Aurelien, What timing - I just submitted a Pull Request [1] for this issue [2] yesterday! If you're the type who is inclined to build the branch and test it out, please do so. Otherwise, I suspect it will make it into the next release. Thanks -Mark [1] https://github.com/apache/nifi/pull/3279

Re: Unexpected char exception in InvokeHTTP

2019-01-30 Thread Mark Payne
It looks like you are attempting to send a header named "Proxy Type". I don't believe that HTTP Headers are allowed to have spaces. So you'll want to check your configuration of the InvokeHTTP processor and see why it's trying to send that header. On Jan 30, 2019, at 12:01 PM, l vic
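For reference, RFC 7230 defines a header field name as a "token", which excludes spaces. A small Python check along those lines might look like the sketch below; the function name is made up for the example, and it only validates the name, not the header value.

```python
import re

# RFC 7230 "token": one or more of these characters, with no whitespace.
TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def is_valid_header_name(name):
    """Return True if the name is a syntactically valid HTTP header name."""
    return TOKEN.fullmatch(name) is not None


with_space = is_valid_header_name("Proxy Type")
with_hyphen = is_valid_header_name("Proxy-Type")
```

A name such as "Proxy Type" fails the check because of the space, whereas "Proxy-Type" passes.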

Re: NiFi consumers concurrency

2019-01-24 Thread Mark Payne
Boris, On the Settings tab, have you changed the value of the "Yield Duration"? The default, I believe, is 1 second. I would recommend that you change that to "0 sec" and that may do the trick. Thanks -Mark On Jan 24, 2019, at 3:30 PM, Boris Tyukin <bo...@boristyukin.com> wrote: any

Re: flowfiles stuck in load balanced queue; nifi 1.8

2019-01-17 Thread Mark Payne
worth considering a release soon. Sent from my iPhone On Jan 17, 2019, at 6:59 PM, dan young <danoyo...@gmail.com> wrote: Hello Mark, We're seeing "stuck" flow files again, this time within a PG...see attached screen shots :( On Fri, Dec 28, 2018 at 8:43 AM Mark Pay

Re: flowfiles stuck in load balanced queue; nifi 1.8

2019-01-05 Thread Mark Payne
king good so far and will look forward to 1.9. Regards, Dano On Fri, Dec 28, 2018 at 8:43 AM Mark Payne <marka...@hotmail.com> wrote: Dan, et al, Great news! I was able to replicate this issue finally, by creating a Load-Balanced connection between two Process Groups/Ports instead of

Re: NiFi (De-)"Compress Content" Processor causes to fill up content_repo insanly fast by corrupt GZIP files

2019-01-04 Thread Mark Payne
sors, could be as well that one of them causing it? Can someone share a (java) codesnipplet which ensures that a custom processor doesn’t keep the flowfiles in content repo? Cheers Josef From: Mark Payne mailto:marka...@hotmail.com>> Reply-To: "users@nifi.apache.org<mailto:u

Re: NiFi (De-)"Compress Content" Processor causes to fill up content_repo insanly fast by corrupt GZIP files

2019-01-04 Thread Mark Payne
Josef, Thanks for the info! There are a few things to consider here. Firstly, you said that you are using NiFi 1.8.0. Are you using the new Load Balancing capability? I.e., do you have any Connections configured to balance load across your cluster? And if so, are you load-balancing any 0-byte

Re: flowfiles stuck in load balanced queue; nifi 1.8

2018-12-28 Thread Mark Payne
t 6:13 PM Mark Payne <marka...@hotmail.com> wrote: Ok, I just wanted to confirm that when you said “once it rejoins the cluster that flow file is gone” that you mean “the flowfile did not exist on the system” and NOT “the queue size was 0 by the time that I looked at the UI.” I.e., is

Re: flowfiles stuck in load balanced queue; nifi 1.8

2018-12-26 Thread Mark Payne
, 2018 at 6:13 PM Mark Payne <marka...@hotmail.com> wrote: Ok, I just wanted to confirm that when you said “once it rejoins the cluster that flow file is gone” that you mean “the flowfile did not exist on the system” and NOT “the queue size was 0 by the time that I looked at t

Re: flowfiles stuck in load balanced queue; nifi 1.8

2018-12-26 Thread Mark Payne
s disconnected I restart nifi, and then once it rejoins the cluster that flow file is gone. If we try to empty the queue, it will just say that there are no flow files in the queue. On Wed, Dec 26, 2018, 5:22 PM Mark Payne <marka...@hotmail.com> wrote: Hey Dan, Thanks, this is super useful

Re: flowfiles stuck in load balanced queue; nifi 1.8

2018-12-26 Thread Mark Payne
nifi 1.8 Heya Mark, So I added a Log Attribute Processor and routed the connection that had the "stuck" flowfile to it. I ran a get diagnostics to the Log Attribute processor before I started it, and then ran another diagnostics after I started it. The flowfile stayed in

Re: flowfiles stuck in load balanced queue; nifi 1.8

2018-12-24 Thread Mark Payne
r id should I be trying to gather the diagnostics on? The queue is in between two processor groups. Maybe the issue with the Unknown User has to do with some policy I don't have set correctly? Happy Holidays! Regards, Dano On Wed, Dec 19, 2018 at 6:51 AM Mark Payne mailto:marka...@ho

Re: flowfiles stuck in load balanced queue; nifi 1.8

2018-12-19 Thread Mark Payne
we want to grab it when we see the stuck Flowfile in a queue, correct? Also, does the nifi thread dump provide anything? This was from the node that seemed to have the stuck Flowfile... Dano On Wed, Dec 19, 2018, 6:51 AM Mark Payne <marka...@hotmail.com> wrote: Hey Josef, Dano, Firstly

Re: flowfiles stuck in load balanced queue; nifi 1.8

2018-12-19 Thread Mark Payne
/nifi-docs/html/user-guide.html#Summary_Page On Dec 19, 2018, at 2:18 AM, <josef.zahn...@swisscom.com> wrote: Hi Dano Seems that the problem has been seen by a few people but until now nobody from NiFi team really cared about it – except Mark Pay

Re: ContentNotFoundException while manipulating Cloned FlowFile

2018-12-06 Thread Mark Payne
Robert, Thanks for reporting this - and for all the details! The good news is that it is easy to replicate. The better news is that it was pretty easy to fix :) I have created NIFI-5879 [1] to track this issue, and there is a Pull Request up for it now. Thanks! -Mark [1]

Re: Load Balancing connection issue

2018-12-03 Thread Mark Payne
Hi Kien, Thanks for the details! Can you tell us a bit more about how the Connection is configured? What is the Load Balance Strategy that you're using? Do you have Back-Pressure enabled? Is it configured for the default 10,000 FlowFiles / 1 GB, or have you changed those settings? Are you

Re: Getting a timestamp for today at midnight?

2018-11-16 Thread Mark Payne
19:00:00 On Wed, Nov 14, 2018 at 9:21 AM Mark Payne <marka...@hotmail.com> wrote: You should be able to do something like: ${now():divide( 86400000 ):multiply( 86400000 )} I.e., use integer division to divide by the number of milliseconds in a day, which gives you the number of day

Re: Problem Debugging InvokeHTTP Processor in Nifi 1.8.0

2018-11-16 Thread Mark Payne
From: Mark Payne <marka...@hotmail.com> Sent: Friday, November 16, 2018 2:14 PM To: users@nifi.apache.org Subject: Re: Problem Debugging InvokeHTTP Processor in Nifi

Re: Problem Debugging InvokeHTTP Processor in Nifi 1.8.0

2018-11-16 Thread Mark Payne
Hi Jim, Can you build a template of your flow and share that? If so, that's usually the easiest way to try to replicate the behavior and to understand exactly how your flow is configured. Thanks -Mark On Nov 16, 2018, at 2:58 PM, Williams, Jim <jwilli...@alertlogic.com> wrote: Hello,

Re: NiFi 1.8 and stuck flowfile in Load Balanced enabled queue

2018-11-16 Thread Mark Payne
just 1 or 2 files). The log (with default log levels) show no WARN or ERRORs… Thanks in advance, Josef From: Mark Payne <marka...@hotmail.com> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org> Date: Mond

Re: Getting a timestamp for today at midnight?

2018-11-14 Thread Mark Payne
You should be able to do something like: ${now():divide( 86400000 ):multiply( 86400000 )} I.e., use integer division to divide by the number of milliseconds in a day, which gives you the number of days since epoch. Then multiply by 86,400,000 again to convert from days back to milliseconds. While
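The same arithmetic, written out in Python as a quick sanity check. The function name and the sample timestamp are chosen for illustration. Note that flooring epoch milliseconds this way gives midnight in UTC, so a further offset would be needed for midnight in a local time zone.

```python
MILLIS_PER_DAY = 24 * 60 * 60 * 1000  # 86,400,000


def midnight_utc_millis(now_millis):
    """Floor an epoch-millisecond timestamp to the start of its UTC day."""
    days_since_epoch = now_millis // MILLIS_PER_DAY  # integer division
    return days_since_epoch * MILLIS_PER_DAY


# 2018-11-14 14:21:00 UTC, expressed in milliseconds since the epoch.
sample = 1542205260000
midnight = midnight_utc_millis(sample)
```

For the sample value, `midnight` is 1542153600000, which corresponds to 2018-11-14 00:00:00 UTC.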

Re: JsonPath expression language exception with QueryRecord processor

2018-11-13 Thread Mark Payne
Hi Mandeep, I think this may actually be a bug in QueryRecord, in the way that it's handling the Expression Language. Do please file a JIRA for that. In the meantime, you can probably work around the issue using the replaceNull() method. So if your expression was

Re: ConsumeKafkaRecord won't pull new events from Kafka

2018-11-13 Thread Mark Payne
Mike, Is there new data coming into the Kafka topic? By default, when the Processor is started, it uses an auto commit offset of 'latest'. So that means that if you started the Processor with this setting, the commit offset is saved pointing to the end of the topic. So if no more data is

Re: Issue with GeoEnrichIP in NiFi 1.8.0

2018-11-12 Thread Mark Payne
Philippe, Thanks for reporting this. I have been able to replicate the issue. I created [1] to address the issue and just posted a Pull Request. Thanks -Mark [1] https://issues.apache.org/jira/browse/NIFI-5814 On Nov 12, 2018, at 5:36 AM, PEETERS Philippe <philippe.peet...@etnic.be>

Re: NiFi 1.8 and stuck flowfile in Load Balanced enabled queue

2018-11-12 Thread Mark Payne
Hey Dan, Have you looked through the logs to see if there are any WARNs or ERRORs indicating what's going on? Thanks -Mark On Nov 12, 2018, at 9:06 AM, dan young <danoyo...@gmail.com> wrote: Hello, We have two processor groups connected via the new Load Balancing/Round Robin queue.

Re: Nulls in input data throwing exceptions when using QueryRecord

2018-11-09 Thread Mark Payne
Mandeep, Thanks for reporting these! I put up a JIRA [1] that covers the cases #2 and #4, and a Pull Request to address it. I was not able to replicate #1. The result that I got back was [{"world": "hello"}] - which is what I expect. Perhaps you can share a template that shows your

Re: Cluster setup in multiple data centers

2018-10-22 Thread Mark Payne
Hi Vijay, You may run into some problems if you try to build a single NiFi cluster that spans multiple data centers. NiFi depends on ZooKeeper to perform leader election to determine which node is the Cluster Coordinator and which is the Primary Node. ZooKeeper guidance indicates that you

Re: Listing S3

2018-09-25 Thread Mark Payne
it'd help answering some > use cases. > We just need to be very clear about the processor behavior for each possible > case/configuration. > > Pierre > > On Tue, Sep 25, 2018 at 16:19, Mark Payne > <marka...@hotmail.com> wrote: >> >>

Re: Listing S3

2018-09-25 Thread Mark Payne
Matt, I think it's very dangerous to manipulate the behavior of the processor so drastically based on the presence or absence of an incoming connection. I think it is fair game, however, to allow for a new property to be added that indicates whether or not state is maintained. Then, the

Re: TailFile State Continuously Increases & Duplicated Events

2018-09-19 Thread Mark Payne
Hey Matthew, Sorry about the confusion. I think the problem is around the "Tailing mode" property. In your case, you want to use "Single file" because your desire is to tail a single "rolling" file. You'd want to use "Multiple files" if you were tailing multiple independent files. So you'll

Re: Scripted Record Reader - Missing Something Obvious

2018-09-17 Thread Mark Payne
Shawn, I believe the issue that you're running into is that you're defining the 'flow_file' field of an Array of type Byte. Which means that it is expecting as its value an object of type Byte[] but you are passing it an object of type byte[]. You'd have to create a Byte[] instead, using the

Re: Certificate Issue when connecting to NiFi Registry

2018-09-17 Thread Mark Payne
e information within the authorizers file, however since we use LDAP, I believe that isn't necessary? Or is it similar to the set up of nifi-to-nifi clustering? Cheers, Nikhil C. On Thu, Sep 13, 2018 at 8:40 PM Mark Payne <marka...@hotmail.com> wrote: Hi Nikhil, The property that

Re: Certificate Issue when connecting to NiFi Registry

2018-09-13 Thread Mark Payne
Hi Nikhil, The property that you mention: "nifi.registry.security.needClientAuth" applies only to user logins. This allows users to login via certificate or other methods by not requiring that they present a client certificate. However, NiFi & registry require mutual authentication for all

Re: Monitoring expiration of flowfiles

2018-09-11 Thread Mark Payne
Christoffer, FlowFile expiration isn't really something that's available in the stats currently. There is an EXPIRE Provenance Event that is emitted whenever a FlowFile gets expired, so you could make use of the SiteToSiteProvenanceReportingTask, if you wanted to, in order to glean this

Re: ListS3 Processor State Defaulted

2018-08-28 Thread Mark Payne
Nikhil, Are you running a NiFi cluster or a single/standalone node? If you're running standalone, state is stored locally on disk. By default, it is stored in ./state/local and so you'll want to ensure that you copy over the state/ directory from your previous install to the new install or

Re: CSV Illegal Initial Character

2018-08-18 Thread Mark Payne
Hey Shawn, It sounds like you need to set the CSV reader’s “Treat First Line as Header” property to true. By default it treats the first line as the first record (as opposed to the header), which looks like the case here. Sent from my iPhone On Aug 18, 2018, at 1:30 PM, Shawn Weeks

Re: NiFi hangs at Computing Data Lineage... 100% for a specific flow

2018-08-17 Thread Mark Payne
IFI-5482 On Aug 17, 2018, at 10:26 AM, Daniel Watson <dcwatso...@gmail.com> wrote: Yes. On Fri, Aug 17, 2018 at 10:19 AM Mark Payne <marka...@hotmail.com> wrote: OK, thanks. Are you using the default implementation of the Provenance Repository? I.e., the PersistentP

Re: NiFi hangs at Computing Data Lineage... 100% for a specific flow

2018-08-17 Thread Mark Payne
Aug 17, 2018 at 10:09 AM Mark Payne <marka...@hotmail.com> wrote: Hi Daniel, What version of NiFi are you running? Thanks -Mark > On Aug 17, 2018, at 10:07 AM, Daniel Watson > <dcwatso...@gmail.com> wrote: > > Anyone have any issues with the data lineage

Re: Design pattern advice needed

2018-08-17 Thread Mark Payne
Hey Bob, The InferAvroSchema processor actually works against JSON and CSV data. It is designed to infer an Avro Schema so that you can convert either of those into Avro if you want. So you can send it JSON data and it will infer the schema for you and put it in the "inferred.avro.schema"

Re: NiFi hangs at Computing Data Lineage... 100% for a specific flow

2018-08-17 Thread Mark Payne
Hi Daniel, What version of NiFi are you running? Thanks -Mark > On Aug 17, 2018, at 10:07 AM, Daniel Watson wrote: > > Anyone have any issues with the data lineage screen? My NiFi instance can't > compute the data lineage for a specific flow. It worked originally, then > after running

Re: Detect a pattern in incoming json content

2018-08-15 Thread Mark Payne
Jim, I'd recommend RouteText. ScanContent would also be an alternative. Thanks -Mark > On Aug 15, 2018, at 2:02 PM, James McMahon wrote: > > Good afternoon. I have a requirement to search for and detect a pattern > "request":"false" is anywhere in the content of a flowfile. The content is

Re: NiFi 1.6.0 cluster stability with Site-to-Site

2018-08-10 Thread Mark Payne
Joe G, Also, to clarify, when you say "when we add receiving Site-to-Site traffic to the mix, the CPU spikes to the point that the nodes can't talk to each other, resulting in the inability to view or modify the flow in the console" what exactly does "when we add receiving Site-to-Site traffic

Re: AVRO is the only output format with ExecuteSQL

2018-08-07 Thread Mark Payne
Boris, Using a Record-based processor does not mean that you need to define a schema upfront. This is necessary if the source itself cannot provide a schema. However, since it is pulling structured data and the schema can be inferred from the database, you wouldn't need to. As Matt was saying,

Re: NiFi Data Usage via Rest API

2018-07-26 Thread Mark Payne
hink I will mull this over for a bit -Ryan On Thu, Jul 26, 2018 at 9:57 AM, Mark Payne <marka...@hotmail.com> wrote: Hey Ryan, The stats that you are seeing here is a rolling 5-minute window. The "bytesReceived" indicates the number of bytes that were received from ex

Re: High CPU load upon starting non-connected Out Port inside PG

2018-07-26 Thread Mark Payne
KT, I can confirm that this is the behavior I'm seeing as well. I went ahead and created a JIRA [1] for this. I think the bug really is in the fact that we allow you to start the Port at all. Just like some Processors are annotated as Requiring Input in order to be valid, ports should be too

Re: NiFi Data Usage via Rest API

2018-07-26 Thread Mark Payne
Hey Ryan, The stats that you are seeing here is a rolling 5-minute window. The "bytesReceived" indicates the number of bytes that were received from external systems (i.e., the number of bytes reported as Provenance RECEIVE events). The "bytesSent" indicates the number of bytes that were sent

Re: Queue Issues

2018-07-20 Thread Mark Payne
Travis, If you Configure the FetchFile processor and go to the Scheduling tab, is it Timer-Driven with a Run Schedule of "0 secs"? Can you try going in your browser to /nifi-api/processors//diagnostics and send the JSON that it returns to the mailing list? I.e., if you go to

Re: Identify new records of CSV file

2018-07-18 Thread Mark Payne
Hi Vyshali, Have you looked into the TailFile processor? I believe that will do what you are asking for. Thanks -Mark Sent from my iPhone On Jul 18, 2018, at 8:30 AM, N, Vyshali <vyshal...@honeywell.com> wrote: Hi, I have a CSV file and the data gets appended to that then and there.

Re: Question about how Nifi is handling ExecuteStreamCommand

2018-07-13 Thread Mark Payne
Jean-Sébastien, It will create a new process for each FlowFile in the queue. Thanks -Mark On Jul 13, 2018, at 8:55 AM, Jean-Sebastien Vachon <jsvac...@brizodata.com> wrote: Hi Let's say I have an external process that I am running using an ExecuteStreamCommand, will Nifi keep the

Re: [EXT] Adding a file to a zip file

2018-07-11 Thread Mark Payne
Kiran, What do you have set for the "Maximum number of Bins" property of MergeContent? Each 'zip bundle' will have all of the FlowFiles added to the same bucket. So if you have more 'zip bundles' coming in than you have available buckets, it will evict one of the bins before all of its FlowFiles

Re: simple JMS pub-sub not working

2018-06-28 Thread Mark Payne
a:424) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348) at com.rabbitmq.jms.client.RMQMessage.instantiateRmqMessage(RMQMessage.java:1064)

Re: ReplaceText Special Character

2018-06-26 Thread Mark Payne
Shawn, There are certainly a few different ways to handle this. Unfortunately, it may not be as simple and straight-forward as you might expect, because you're trying to replace text with some non-readable characters. Jagrut's method should certainly work. However, a simpler solution, I think,

Re: Consumekafka error I

2018-06-14 Thread Mark Payne
Faisal, How much heap do you have allocated to your NiFi instance? In conf/bootstrap.conf the default value is 512 MB. If you haven't changed that, you could be just running out of heap. Also, have you changed the maximum number of threads available to your NiFi instance? In the top-right

Re: NiFi Performance Analysis Clarification

2018-06-13 Thread Mark Payne
Prashanth, Also of note, are you actually updating any fields in the CSV that you receive with UpdateRecord / your custom processor? Or are you just using that to convert the CSV to Avro? If the latter, you can actually just remove this processor from your flow entirely and simply use

Re: NiFi Performance Analysis Clarification

2018-06-13 Thread Mark Payne
ply. Please find the comments inline. Thanks & Regards, Prashanth From: Mark Payne [mailto:marka...@hotmail.com] Sent: Wednesday, June 13, 2018 6:07 PM To: users@nifi.apache.org Subject: Re: NiFi Performance Analysis Clarification Prashanth, Whenever th

Re: Fun with DistributeLoad

2018-06-13 Thread Mark Payne
Martijn, "As an aside, does DistributeLoad use backpressure to know what processor is / is not available?" - It depends on the value that you set for the Processor's "Distribution Strategy." The default is Round Robin, which means that if any of the connections applies Back Pressure, then

Re: Fun with DistributeLoad

2018-06-13 Thread Mark Payne
Martijn, Typically when I come across a set of processors like this, I go with an approach like https://imgur.com/a/3Zh3FeN So we have a DistributeLoad going to one of 24 different PutS3Object processors. Each processor's 'failure' relationship is then routed to a funnel, and that funnel just

Re: NiFi Performance Analysis Clarification

2018-06-13 Thread Mark Payne
Prashanth, Whenever the FlowFile Repository performs a Checkpoint, it has to ensure that it has flushed all data to disk before continuing, so it performs an fsync() call so that any data buffered by the Operating System is flushed to disk as well. If you're using the same physical drive /

Re: Provenance repository events stop being collected?

2018-05-23 Thread Mark Payne
Tim, Typically when I see that issue, it's due to OutOfMemory Errors or constant garbage collection. How large is your heap? FWIW, an alternative is to use "kill -3 " The -3 will cause java to perform a thread dump. So you can do "cat run/nifi.pid | xargs kill -3" On May 23, 2018, at 11:13

Re: Custom processors and restart

2018-05-04 Thread Mark Payne
Guido, Dynamically deploying extensions is very much on the roadmap. There is actually a Feature Proposal [1] written that outlines some of the ideas about how it would work. The idea being that extensions would be published to an Extension Registry, and NiFi would then interact with that

Re: Validate Record issue

2018-05-02 Thread Mark Payne
t.j...@open-insights.co.in> wrote: Hi Mark, I set Strict Type Checking to false, but it still isn't allowed. Thanks, Mohit From: Mark Payne <marka...@hotmail.com> Sent: 02 May 2018 23:00 To: users@nifi

Re: MergeRecord, queue & backpressure

2018-04-13 Thread Mark Payne
Aurélien, In that case you're looking to merge about 500,000 FlowFiles into a single FlowFile, so you'll definitely want to use a cascading approach. I'd shoot for about 1 MB for the first MergeRecord and then merge 128 of those together for the second MergeRecord. The provenance backpressure
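The numbers in this suggestion can be checked with a little arithmetic. A rough sketch: the 500,000 FlowFiles and the fan-in of 128 come from the message, "about 1 MB" is read as 1 MiB, and the average FlowFile size is only what those figures imply, not something the message states.

```python
import math

TOTAL_FLOWFILES = 500_000         # from the message
FIRST_STAGE_BYTES = 1024 * 1024   # "about 1 MB", read here as 1 MiB (assumption)
SECOND_STAGE_FANIN = 128          # from the message


def cascade_plan(total: int, first_stage_bytes: int, fanin: int) -> dict:
    """Derive the bundle sizes implied by a two-stage merge."""
    return {
        # FlowFiles each first-stage bundle must hold so that `fanin` bundles cover the total
        "flowfiles_per_first_stage_bundle": math.ceil(total / fanin),
        # average input size implied if such a bundle weighs first_stage_bytes
        "implied_avg_flowfile_bytes": first_stage_bytes * fanin / total,
        # size of the single FlowFile produced by the second stage
        "final_bytes": first_stage_bytes * fanin,
    }


if __name__ == "__main__":
    for key, value in cascade_plan(TOTAL_FLOWFILES, FIRST_STAGE_BYTES, SECOND_STAGE_FANIN).items():
        print(key, value)
```

With these inputs each first-stage bundle holds roughly 3,900 FlowFiles and the final FlowFile is about 128 MiB, so neither merge step has to track anything close to 500,000 FlowFiles in a single bin.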

Re: ConvertCSVToAvro taking a lot of time when passing single record as an input.

2018-04-02 Thread Mark Payne
","int"],"default":null},{"name":"age","type":["null","string"],"default":null}]} It doesn't pass the record to the invalid relationship. Instead it keeps the file in the queue before the ValidateRecord processor. Mohit From

Re: ConvertCSVToAvro taking a lot of time when passing single record as an input.

2018-04-02 Thread Mark Payne
t is passing to the invalid relationship. Instead it keeps on throwing bulletins, keeping the flowfile in the queue. Any suggestion? Mohit From: Mark Payne <marka...@hotmail.com> Sent: 02 April 2018 19:02 To: users@nifi.apache.org

Re: ConvertCSVToAvro taking a lot of time when passing single record as an input.

2018-04-02 Thread Mark Payne
convert 6-7k per second, which, though not optimal, is much better than 45-50 records per second. Is there any other improvement I can make? Mohit From: Mark Payne <marka...@hotmail.com> Sent: 02 April 2018 18:30 To: users@nifi.apache.org

Re: ConvertCSVToAvro taking a lot of time when passing single record as an input.

2018-04-02 Thread Mark Payne
Mohit, I agree that 45-50 records per second is quite slow. I'm not very familiar with the implementation of ConvertCSVToAvro, but it may well be that it must perform some sort of initialization for each FlowFile that it receives, which would explain why it's fast for a single incoming

Re: How does ControlRate Grouping Attribute really work?

2018-03-28 Thread Mark Payne
e connection. What should the behaviour be when the connection is shared? Regards, Leandro Lourenco On Mon, Mar 26, 2018 at 8:49 PM, Mark Payne <marka...@hotmail.com> wrote: Leandro, As far as I can tell, the processor is in fact behaving as you

Re: PutHDFS with mapr

2018-03-24 Thread Mark Payne
Andre, I knew this was possible but had no idea how. Thanks for the great explanation and associated caveats! -Mark On Mar 24, 2018, at 1:04 AM, Andre wrote: Ravi, There are two ways of solving this. One of them (suggested to me MapR

Re: NiFi retry capabilities

2018-03-06 Thread Mark Payne
Hey Boris, Using the UpdateAttribute and RouteOnAttribute approach is only necessary when you want to retry N times (or for some time period) and, once that has elapsed, treat the data differently. Most of the time, though, the usual approach is to simply loop the 'failure' relationship back
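The counter-based variant mentioned here boils down to "increment an attribute, then compare it with a limit". The sketch below models that decision in plain Python. It is not NiFi Expression Language or processor configuration, and the attribute name `retry.count` and the route names `retry` and `give_up` are hypothetical.

```python
from typing import Dict, Tuple


def route_after_failure(attributes: Dict[str, str],
                        max_retries: int = 3) -> Tuple[str, Dict[str, str]]:
    """Return the route to take and the updated attributes after one failure."""
    # The UpdateAttribute step: bump the counter (a missing attribute counts as 0).
    count = int(attributes.get("retry.count", "0")) + 1
    updated = {**attributes, "retry.count": str(count)}
    # The RouteOnAttribute step: keep retrying until the limit is exceeded.
    if count <= max_retries:
        return "retry", updated
    return "give_up", updated


if __name__ == "__main__":
    attrs: Dict[str, str] = {}
    for _ in range(4):
        route, attrs = route_after_failure(attrs)
        print(route, attrs["retry.count"])
```

When no limit is needed, none of this bookkeeping is required.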

Re: Cannot convert to Record a valid Avro schema

2018-02-27 Thread Mark Payne
ction and I cannot change it for now. On Tue, 27 Feb 2018 at 11:17 Mike Thomsen <mikerthom...@gmail.com> wrote: That doesn't look like the right way to specify an empty array. This SO example fits about what I'd expect: https://stackoverflow.com/a/421

Re: Cannot convert to Record a valid Avro schema

2018-02-27 Thread Mark Payne
Juan, So the scenario that you laid out in NIFI-4893 is not one that I've personally encountered. What does it mean exactly to have an Avro schema with an "array" type that has a value? In the example that you laid out, the field has: "type": {"type": "array", "items": "int" }, "default":

Re: [Data Flow] File content not read completely

2018-02-21 Thread Mark Payne
rg> Date: 02/16/2018 06:00 PM Subject: Re: [Data Flow] File content not read completely Hi Mark, Thanks for looking into this. I am trying to put in the components you have suggested. I'll update. Regards, Valencia Mark Payne ---02/15/2018 07:09:32 PM---Valenc
