Re: appending \n to a flow

2017-04-20 Thread Mark Payne
Phil, For your configuration of ReplaceText, I think you need a couple of changes. Firstly, you don't want to append a newline to the end of every line - you want to append it to the end of the FlowFile. So you'd want "Evaluation Mode" set to "Entire Text". Secondly, NiFi is not going to substi

Re: Clustering Best Practices?

2017-04-20 Thread Mark Payne
Jim, I would offer you a few bits of advice. First, NiFi relies on ZooKeeper to coordinate which node is responsible to act as the Cluster Coordinator and which node should be the Primary Node. NiFi does allow you to start an embedded ZooKeeper, but for production use, it is recommended that yo

Re: Preserve or replace 'state' directory during upgrade - NiFi 1.1.2

2017-03-30 Thread Mark Payne
Hi Wes, Moving the state directory outside of the nifi home directory is definitely a good idea. Otherwise processors would lose state so List* processors, for instance, would start over from the beginning. Thanks -Mark Sent from my iPhone > On Mar 30, 2017, at 7:07 PM, Wes Lawrence wrote:

Re: Nifi - Data provenance not reporting anymore

2017-03-30 Thread Mark Payne
Uwe, When you issue a Provenance Query, if it finds more results than the max (the UI requests 1000 results max) then it will simply return up to the max results. It does not necessarily return the newest results. If you click on the Search button in the Provenance UI and restrict the time windo

Re: Convert attribute to ascii value

2017-03-16 Thread Mark Payne
Hi Selvam, I don't believe there is anything in the Expression Language that will give you the ascii value for a character. Your best bet is probably to use an ExecuteScript processor and write a simple Groovy (or python or whatever you're most comfortable with) script to do this. Thanks -Mark
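
A minimal sketch of the conversion such a script would perform, written as plain Python rather than as an ExecuteScript body. The function name `ascii_values` is hypothetical, and reading or writing FlowFile attributes inside NiFi is not shown here.

```python
def ascii_values(text):
    """Return the ASCII code of each character in text."""
    codes = []
    for ch in text:
        code = ord(ch)
        # Reject anything outside the 7-bit ASCII range
        if code > 127:
            raise ValueError(f"{ch!r} is not an ASCII character")
        codes.append(code)
    return codes


print(ascii_values("NiFi"))  # [78, 105, 70, 105]
```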

Re: Upgrading NiFi, preserving process groups

2017-03-15 Thread Mark Payne
Jim, Yes - if you copy over the flow.xml.gz then that's all that you should need to do in order to preserve your flow. If you are running NiFi 0.x though you will also want to copy over your conf/templates directory. In the 1.x baseline, Templates are included in the flow.xml also. Thanks -Mark

Re: Data Provenance Question

2017-03-07 Thread Mark Payne
Hey Frank, So the way that the provenance repository works, it writes out the data "inline" as the FlowFiles traverse the system. It then periodically (by default it's every 30 seconds or after writing 100 MB of Provenance data) "rolls over." When it rolls over, it begins writing data to a new f

Re: RouteText / Replcace Text to remove first line of file

2017-03-06 Thread Mark Payne
Joe, In terms of updating a processor to better handle this, we could update RouteText to do so. The idea being that if a large percentage of the time (say > 80% of the time) the FlowFile routed to one of the relationships is a single, contiguous subset of the data in the original FlowFile, the

Re: HandleHttpRequest failing

2017-02-22 Thread Mark Payne
ll paths 2. do not autoterminate failure conditions 3. DELETE the StandardHttpContextMap (to clear the log jam) 4. Recreate it fresh, which I presume creates it empty (I hope) What else must I do to recover? And how do I properly handle those "broken connection" situations? On Wed, Feb 22,

Re: HandleHttpRequest failing

2017-02-22 Thread Mark Payne
failure conditions 3. DELETE the StandardHttpContextMap (to clear the log jam) 4. Recreate it fresh, which I presume creates it empty (I hope) What else must I do to recover? And how do I properly handle those "broken connection" situations? On Wed, Feb 22, 2017 at 10:06 AM, Mark Payne mail

Re: HandleHttpRequest failing

2017-02-22 Thread Mark Payne
Jim, You likely have a path through your flow where you are receiving an HTTP Request via HandleHttpRequest but you never respond via a HandleHttpResponse. When using these processors, it's important that every incoming FlowFile go to a HandleHttpResponse processor. Do you have some path in you

Re: Returning Responses to Http Requests

2017-02-15 Thread Mark Payne
Jim, When you configure your HandleHttpRequest processor, there is a property for the HttpContextMap to use. Within the Standard Http Context Map you can configure a property named "Request Expiration". By default, it is set to 1 minute. If any request is not handled within that time limit, it

Re: Bug, Groovy log.info() does not work in version 1.1.1

2017-02-09 Thread Mark Payne
Hey Carl, In version 1.0.0, the default log level for the 'org.apache.nifi.processors' class was changed from INFO level to WARN level. This was done because there is a tremendous amount of information logged by most of the standard processors, and many users were complaining that the logs were

Re: Data extraction for 100 columns is possible in NiFi?

2017-01-30 Thread Mark Payne
Prabhu, My guess is that you probably could find some way to do this with the standard out-of-the-box processors that come with NiFi. Perhaps by using Extract Text to extract the header columns, and then using ReplaceText and perhaps a few other processors. Going down this route though is likely

Re: How can i compare hours in which data having with current datetime hours?

2017-01-23 Thread Mark Payne
Prabhu, I think the RouteText processor will give you what you need. If you set the "Matching Strategy" property to "Satisfies Expression," then it will allow you to use the Expression Language to evaluate each line of text in the file. Each line of text is available using the "line" variable. S

Re: 5 minutes

2017-01-13 Thread Mark Payne
Alessio, As Bryan mentioned, we do allow for Reporting Tasks to report these sorts of metrics and bulletins to other systems. One thing that I wanted to note, though, is that you can right-click on a Processor and view Status History. This will show you the metrics that the processor shows (and

Re: Json Validation for mandatory fields

2017-01-12 Thread Mark Payne
Selvam, You can use EvaluateJsonPath to extract the particular JSON value into an attribute and then use RouteOnAttribute to make sure that the field exists (and is valid if necessary). For example, for EvaluateJsonPath you might use something like: name: $.person.name And then in RouteOnAtt
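
The extract-then-route idea can be sketched outside NiFi as follows. This is an illustration only: the `$.person.name` path comes from the message above, while the function name and the "valid"/"invalid" labels are hypothetical stand-ins for whatever relationships the flow would define.

```python
import json


def route_on_name(flowfile_content):
    """Extract $.person.name and decide where the FlowFile should go."""
    doc = json.loads(flowfile_content)
    person = doc.get("person") if isinstance(doc, dict) else None
    name = person.get("name") if isinstance(person, dict) else None
    # Treat a missing, non-string, or blank name as a failed mandatory-field check
    if isinstance(name, str) and name.strip():
        return "valid"
    return "invalid"


print(route_on_name('{"person": {"name": "Selvam"}}'))  # valid
print(route_on_name('{"person": {}}'))                  # invalid
```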

Re: Cluster is still voting on which Flow is the correct flow for the cluster

2017-01-12 Thread Mark Payne
en you know the solution everything looks clear, but is there a way to have a clearer error message into the logs ? From: Mark Payne mailto:marka...@hotmail.com>> Sent: Thursday, January 12, 2017 3:24:22 PM To: users@nifi.apache.org<mailto:users@nifi.apache.o

Re: Cluster is still voting on which Flow is the correct flow for the cluster

2017-01-12 Thread Mark Payne
Alessio, It looks like the flow is not the same on both nodes. Since you only have two nodes, NiFi is not able to come up with a majority of votes. Specifically, it looks like the conf/authorizations.xml or conf/authorizers.xml or conf/users.xml file is out-of-sync. You can try copying these

Re: ListFile, FetchFile Scalability

2017-01-10 Thread Mark Payne
few words about the purpose of each of the following? * file snapshot * file wali.lock * the partition[0-15] subdirectories, each of which appears to own a journal file * the journal file Where are the dates you referenced? Thank you again for your insights. On Tue, Jan 10, 2017 at 8:51 AM, Mark Payne

Re: ListFile, FetchFile Scalability

2017-01-10 Thread Mark Payne
Hi Jim, ListFile does not maintain a list of files w/ datetime stamps. Instead, it stores just two timestamps: the timestamp of when a listing was last performed, and the timestamp of the newest file that it has sent out. This is done precisely because we need it to be able to scale as the input

Re: How to untar "tar.gz" file

2017-01-09 Thread Mark Payne
Hi Selvam, UnpackContent supports tar format. Gzip is a compression format that wraps the tar format. So you would want to use CompressContent set to Decompress mode with gzip as the format. This would give you just a tar file that UnpackContent can handle. Thanks -Mark Sent from my iPhone O
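
The two-layer structure described above (a gzip stream wrapping a tar archive) can be shown with Python's standard library. This sketch does not use NiFi at all; `decompress` and `unpack` are hypothetical names that mirror the two processor steps, and `make_tar_gz` only builds sample data.

```python
import gzip
import io
import tarfile


def make_tar_gz(files):
    """Build an in-memory .tar.gz from a {name: bytes} mapping (sample data only)."""
    tar_buf = io.BytesIO()
    with tarfile.open(fileobj=tar_buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return gzip.compress(tar_buf.getvalue())


def decompress(gz_bytes):
    """Step 1: strip the gzip layer, leaving a plain tar archive."""
    return gzip.decompress(gz_bytes)


def unpack(tar_bytes):
    """Step 2: unpack the uncompressed tar archive into {name: bytes}."""
    out = {}
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:
        for member in tar.getmembers():
            if member.isfile():
                out[member.name] = tar.extractfile(member).read()
    return out


archive = make_tar_gz({"a.txt": b"hello", "b.txt": b"world"})
print(unpack(decompress(archive)))
```

Opening with mode `"r:"` accepts only an uncompressed tar, so the second step fails if the gzip layer has not been removed first.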

Re: NiFI server restart blocked

2017-01-05 Thread Mark Payne
Hey Leo, Yes, the error that you are seeing is related to NIFI-2907. However, it's really more of a warning/informational message, even though it shows an ERROR-level log message. It should not affect your instance of NiFi in any way other than creating an annoying (and concerning) error message

Re: Prioritizing Penalized Files Last

2017-01-03 Thread Mark Payne
Hey Alan, The FlowFile Queue uses an 'implicit prioritizer' that always pushes penalized FlowFiles to the bottom of the queue. If you are not seeing that behavior, then there may be a bug of some sort. If you right-click on the connection and List Queue, do you see any FlowFiles that are shown w

Re: Failing to Start NiFi 1.1.0, OverlappingFileLockException

2016-12-21 Thread Mark Payne
Hey Peter, The FlowFile repository obtains a lock to ensure that no other process is using that directory. Getting an OverlappingFileLockException means that there is actually another process that has a lock. Can you verify that no other instance of NiFi is running on the node? If possible woul

Re: error with flow.xml.gz

2016-11-29 Thread Mark Payne
Olav, You can update nifi.properties and set the "nifi.flowcontroller.autoResumeState" property to false. This will cause NiFi to start with all processors, etc. stopped. A couple of things to remember, though: 1) You'll need to change the property back to "true" or else it will continue to st

Re: Keep attributes when merging

2016-11-29 Thread Mark Payne
Giovanni, In the scenario that you laid out here, the merged FlowFile will not have a 'dt' attribute because there are conflicting values for the 'dt' attribute. As a result, the attribute is not carried through. If it is important to you that this attribute be carried through, you can set the

Re: SplitJson:GC Overhead Limit Exceeded

2016-11-17 Thread Mark Payne
Hi Mike, Certainly, I would recommend trying to change the max heap to say 2 GB and see if that gives you what you need. Looking at the code, it does look like this Processor may not be the most efficient in how it is parsing the JSON. There are libraries, for example, that provide a "Streaming

Re: Controller services visibility problem

2016-11-17 Thread Mark Payne
Hi Panos, You are correct in that Controller Services that are created in the top-right corner will not be available to Processors. These are "controller-level" services and are available only to Reporting Tasks and other Controller Services. If you want to use a Controller Service for Processo

Re: NPE MergeContent processor

2016-11-15 Thread Mark Payne
Conrad, Good news - I have been able to replicate the issue and track down the problem. I created a JIRA to address it - https://issues.apache.org/jira/browse/NIFI-3040. I have a PR up to address the issue. It looks like the problem is due to Replaying a FlowFile from Provenance and then rest

Re: DistributedMapCache

2016-11-07 Thread Mark Payne
Yari, I have implemented a couple of additional implementations of DistributedMapCacheClient - one for MySQL and one for Memcached. However, I'd not yet gotten them into Apache, as they need some cleanup and some refactoring probably. Eventually I need to get that migrated over. The design of D

Re: How to increase the processing speed of the ExtractText and ReplaceText Processor?

2016-10-18 Thread Mark Payne
are available on this machine? > Only single cpu are available in this machine with > core i5 processor CPU @2.20Ghz. > > ==> Are these the only processors in your flow, or do you have other > dataflows going on in the > same instance as N

Re: How to increase the processing speed of the ExtractText and ReplaceText Processor?

2016-10-17 Thread Mark Payne
Prabhu, Certainly, the performance that you are seeing, taking 4-5 hours to move 3M rows into SQLServer is far from ideal, but the good news is that it is also far from typical. You should be able to see far better results. To help us understand what is limiting the performance, and to make sur

Re: Nifi hardware recommendation

2016-10-14 Thread Mark Payne
Hi Ali, Typically, we see people using a 4-8 GB heap with NiFi. 8 GB is pretty typical for a flow that is expected to have pretty high throughput in terms of the number of FlowFiles, or a large number of processors. However, one thing that you will want to consider in terms of RAM is disk cachin

Re: Processor that Decompresses Files?

2016-10-05 Thread Mark Payne
Hi Keren, Just to clarify - what Joe mentioned here is the recommended approach if your data is GZIP'd. If you have data that is actually ZIP'd instead, you can use UnpackContent. Thanks -Mark > On Oct 5, 2016, at 9:47 AM, Joe Percivall wrote: > > Hello Keren, > > The "decompress" mode is an

Re: Configure Logging - Rolling

2016-10-05 Thread Mark Payne
Manish, I have occasionally encountered this on Windows as well. It appears to be a bug in the logging framework that we use - Logback. There are a couple of JIRAs for Logback already, related to not properly rolling over filenames. It's not clear to me yet what does or does not trigger it to

Re: flow.xm.gz not being archived to ./conf/archive in NiFi 0.6.1

2016-10-03 Thread Mark Payne
Simon, The ability to have NiFi automatically back up the flow was added in 1.0. Previous to 1.0, you have the ability to manually archive the flow, but it is not done each time that the flow is modified. Thanks -Mark > On Oct 3, 2016, at 11:06 AM, Simon Tack wrote: > > Hello, > > I am usin

Re: Remove top N lines from a text file

2016-09-28 Thread Mark Payne
Peter, Another option that may be a lot easier for you is to use the RouteText processor. If you set Matching Strategy to "Satisfies Expression", you can use the Expression Language to inspect FlowFile attributes, etc. But the RouteText processor also exposes two additional variables: _line_ a

Re: ETL processors for NiFi

2016-09-19 Thread Mark Payne
Hi Karthik, I think what you want to be using here is PutSQL, rather than ExecuteSQL. ExecuteSQL is designed to perform a SELECT statement, whereas PutSQL would update a database. PutSQL expects the incoming FlowFile to contain the SQL to execute. So you could use ReplaceText with the 'Replacem

Re: beginner question on destination failure

2016-09-15 Thread Mark Payne
Ram, You can simply create a connection from PutHBase back to itself and select the 'failure' relationship. This will cause it to stay in the flow until you are able to push to HBase again. Thanks -Mark > On Sep 15, 2016, at 3:36 PM, Nathamuni, Ramanujam wrote: > > Good Evening: > > I have

Re: Best Practice for backing up NiFi Flows

2016-09-14 Thread Mark Payne
Dan, Yes, you should be able to pre-deploy templates. When NiFi starts up, it looks in the conf/templates directory (by default - this directory can be changed in the nifi.properties file). It looks for any file that has a suffix of ".template" or ".xml" so you need to be sure that you are nami

Re: OnTrigger - FlowFile is Null

2016-09-14 Thread Mark Payne
Manish, This happens for a few reasons: * Processor has no incoming connections (is a source processor) * Processor has @ScheduleWhenEmpty annotation * Processor has more than 1 concurrent task The main reason is the third one above. If you have multiple concurrent tasks, Thread 1 can determine

Re: Processor scheduling resets on Nifi Restart

2016-08-31 Thread Mark Payne
Peter, When you say that it is scheduled to run "once a day" do you mean that you are using Timer-Driven scheduling with the scheduling period set to 24 hours? If so, then yes, that is intentional. If you want it to run only once a day, I would recommend that you use the CRON Driven Scheduling

Re: NiFI 1.0.0-BETA NullPointerException on template save containing a InvokeHTTP processor

2016-08-18 Thread Mark Payne
Porta, Thanks for reporting this. Looks like this is the issue in JIRA NIFI-2546 [1]. It has been resolved for the 1.0.0 release. Thanks! -Mark [1] https://issues.apache.org/jira/browse/NIFI-2546 > On Aug 18, 2016, at 9:56 AM, Porta Leonard w

Re: MergeContent with varying number of entries in bins.

2016-08-10 Thread Mark Payne
","file2.txt","file3.txt"]' > > Is it possible in UpdateAttribute to use the Expression Language to return > the length of this array? > > Thanks for your help, > Michael > > On Wed, Aug 10, 2016 at 11:00 AM, Mark Payne <mailto:marka...@h

Re: MergeContent with varying number of entries in bins.

2016-08-10 Thread Mark Payne
Michael, In the MergeContent processor, you can set the "Merge Strategy" to "Defragment." This will tell Merge Content to determine its bin thresholds based on the following FlowFile attributes: fragment.identifier fragment.index fragment.count So you'd need to set those 3 attributes on each of
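
A rough sketch of what binning on those three attributes amounts to, assuming each FlowFile is modeled as a plain dict. The attribute names come from the message above; the `content` key, the function name, and the exact completion rule are assumptions for illustration and are not MergeContent's actual code.

```python
from collections import defaultdict


def defragment(flowfiles):
    """Group fragments by identifier and merge each group once it is complete."""
    bins = defaultdict(list)
    for ff in flowfiles:
        bins[ff["fragment.identifier"]].append(ff)

    merged = {}
    for identifier, fragments in bins.items():
        expected = int(fragments[0]["fragment.count"])
        # Assumption: a bin is complete when it holds fragment.count fragments
        if len(fragments) == expected:
            fragments.sort(key=lambda f: int(f["fragment.index"]))
            merged[identifier] = "".join(f["content"] for f in fragments)
    return merged


parts = [
    {"fragment.identifier": "a", "fragment.index": "1", "fragment.count": "2", "content": "B"},
    {"fragment.identifier": "a", "fragment.index": "0", "fragment.count": "2", "content": "A"},
    {"fragment.identifier": "b", "fragment.index": "0", "fragment.count": "3", "content": "X"},
]
print(defragment(parts))  # {'a': 'AB'}
```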

Re: putazureeventhub error

2016-08-09 Thread Mark Payne
Hi Aaron, Can you check the logs to see if there is any more information? Thanks -Mark > On Aug 9, 2016, at 10:58 AM, Smith, Aaron wrote: > > I am trying to use the put azure event hub processor in nifi .7 and am > getting the below error: > > PutAzureEventHub[id=e6a62157-b3f8-4de0-921f-0a5

Re: Syslog timestamp

2016-08-05 Thread Mark Payne
ther project, it should be the same username and password). Thanks -Mark > On Aug 5, 2016, at 2:14 PM, Madhukar Thota wrote: > > Thanks Mark. Do i have access to jira to submit the issue? > > On Fri, Aug 5, 2016 at 1:54 PM, Mark Payne <mailto:marka...@hotmail.com>> wrote:

Re: Syslog timestamp

2016-08-05 Thread Mark Payne
Madhukar, Currently, I don't think there's any really easy way to do this. You'd have to use a RouteOnAttribute with a regex, probably to determine which date format is being used, and then use another processor to perform the toDate() function. That being said, this is a good idea of somethin

Re: NiFi PutSFTP

2016-08-04 Thread Mark Payne
Sven, Yes, that's what it's asking for. Thanks -Mark > On Aug 4, 2016, at 1:34 PM, Sven Davison wrote: > > When it asks for the key.. is this just a regular openssh key? > > https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.PutSFTP/index.html > >

Re: Nifi cluster nodes regularly stop processing any flowfiles

2016-08-03 Thread Mark Payne
alongfi...@gmail.com>> > >> wrote: > >> > Great, glad there's already a fixed bug for it! Is there anything I try > >> > to > >> > work around it for now, or at least just get longer processing times > >> > between > >>

Re: Nifi cluster nodes regularly stop processing any flowfiles

2016-08-01 Thread Mark Payne
do you agree it is the same thing by looking at the logs? > > Thanks > Joe > > On Mon, Aug 1, 2016 at 11:39 AM, Aaron Longfield wrote: >> Alright, here you go for one of the nodes! >> >> On Mon, Aug 1, 2016 at 10:33 AM, Mark Payne wrote: >>> >>>

Re: Nifi cluster nodes regularly stop processing any flowfiles

2016-08-01 Thread Mark Payne
cessing still got stuck and requiring > NiFi to be restarted on all nodes. It took longer to happen, but they went > down after a few hours. Are there any other things I can look into? > > Thanks! > > -Aaron > > On Thu, Jul 14, 2016 at 2:33 PM, Mark Payne <mailto:mar

Re: export from Teradata

2016-07-20 Thread Mark Payne
Hi Dima, That will work, however it is very dangerous to put jar files into NiFi's lib/ directory. We generally recommend only putting NAR's into the lib/ directory, as any jar that is there will be inherited by all NARs and all classloaders. So you may end up getting some really funny results f

Re: Nifi cluster nodes regularly stop processing any flowfiles

2016-07-14 Thread Mark Payne
Aaron, My guess would be that you are hitting a Full Garbage Collection. With such a huge Java heap, that will cause a "stop the world" pause for quite a long time. Which garbage collector are you using? Have you tried reducing the heap from 48 GB to say 4 or 8 GB? Thanks -Mark > On Jul 14, 2

Re: Large queues in Output Port

2016-07-08 Thread Mark Payne
Hi Kevin, Do you have backpressure configured on the connection out of the Output Port? I would guess that it's not your Output Port that is having problems but rather that the Processor downstream from the Output Port is not keeping up, which results in the FlowFiles queuing up there. Thanks -

Re: RouteText questions (regex, grouping, performance)

2016-07-07 Thread Mark Payne
come out already splitted > > Thanks! > Stephane > > On Thu, Jul 7, 2016 at 1:31 AM Mark Payne <mailto:marka...@hotmail.com>> wrote: > Stephane, > > So the Processors that you mention there mostly would require that you split > your data up into one-line chunk

Re: RouteText questions (regex, grouping, performance)

2016-07-06 Thread Mark Payne
a lot of data coming in (1000 udp packets a second), and yes, the > provenance database has been cramming because we have 6 processors dealing > with this flow before the data exits NiFi. Are there any optimization I could > deal with out of the box? > > Thanks, > Stephane

Re: GetDynamoDB Processor returns only item attribute from DynamoDB but not all attributes.

2016-07-05 Thread Mark Payne
Hi Mike, Personally, I'm not familiar with DynamoDB either, but a quick look at the source & configuration do indeed show that it expects the user to fetch only a single 'attribute' from a Dynamo Item. Looking at the AWS javadocs, it appears that this is the typical usage, as they don't seem to

Re: Multiple JSON fields validation with EvaluateJsonPath

2016-07-01 Thread Mark Payne
hub repo when done? > > Regards, > Angel > > > On Fri, Jul 1, 2016 at 4:43 PM, Mark Payne <mailto:marka...@hotmail.com>> wrote: > Angel, > > OK, sounds like we've got a good path forward. I created a JIRA for this [1]. > > Is this by chance something that

Re: Multiple JSON fields validation with EvaluateJsonPath

2016-07-01 Thread Mark Payne
> Best Regards, > Angel > > > On Fri, Jul 1, 2016 at 4:01 PM, Mark Payne <mailto:marka...@hotmail.com>> wrote: > Ah, I did miss that. In the cases that I have used it, an empty string would > not be a valid value in the JSON, so I > have been ab

Re: Multiple JSON fields validation with EvaluateJsonPath

2016-07-01 Thread Mark Payne
aluateJsonPath always sets the attribute with empty string value, > regardless if the JSON field exists or not. It is only possible to check if > the field has an empty value or not, but not possible to check if it actually > exists. Am I missing something? > > Best Regards, >

Re: RouteText questions (regex, grouping, performance)

2016-07-01 Thread Mark Payne
Hi Stephane, For #1, when you say that you get as many output as lines of text, are you sending in FlowFiles that are only one line of text each? The Processor does not aggregate multiple FlowFiles together, so if you are sending in 1-line FlowFiles, it can only route that FlowFile in 1-line out

Re: Multiple JSON fields validation with EvaluateJsonPath

2016-06-30 Thread Mark Payne
Hi Angel, I don't know the reasoning behind only using 'unmatched' for FlowFile Content, but I am guessing it has to do with the idea that the FlowFile is not modified when routed to this relationship and the author probably wanted to avoid the ambiguity that would occur when some paths matched

Re: Java Heap Size increasing without anything happening

2016-06-30 Thread Mark Payne
Stephane, This is normal and expected. Even though there is no data flowing through your system, the framework does have some maintenance that it has to do, such as pruning old archived data from the Content Repository and Provenance Repository, checkpointing the FlowFile Repository, keeping sta

Re: Convert tweet time to different date/time format

2016-06-29 Thread Mark Payne
Igor, You can use the toDate() and format() functions: ${tweetTimestamp:toDate('EEE MMM dd HH:mm:ss Z yyyy'):format('yyyy-MM-dd HH:mm:ss')} Thanks -Mark > On Jun 29, 2016, at 3:37 PM, Igor Kravzov wrote: > > How can I convert tweet date/time stamp in format "Wed Jun 29 19:04:20 + > 20
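
For comparison, the same parse-then-format step can be sketched in plain Python. The sample value "Wed Jun 29 19:04:20 +0000 2016" is an assumption based on the usual tweet timestamp layout (the quoted example above is cut off), and the function name is hypothetical. Note that this sketch keeps the timestamp's own UTC offset rather than converting to a local zone.

```python
from datetime import datetime


def convert_tweet_time(stamp):
    """Parse a tweet-style timestamp and reformat it as 'YYYY-MM-DD HH:MM:SS'."""
    parsed = datetime.strptime(stamp, "%a %b %d %H:%M:%S %z %Y")
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


print(convert_tweet_time("Wed Jun 29 19:04:20 +0000 2016"))  # 2016-06-29 19:04:20
```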

Re: PutMail processor - how to send to multiple recipients?

2016-06-29 Thread Mark Payne
And by "e-mail names" I meant e-mail addresses. Wow. -Mark > On Jun 29, 2016, at 3:18 PM, Mark Payne wrote: > > Igor , > > You can use a comma-separated list of e-mail names. > > Thanks > -Mark > >> On Jun 29, 2016, at 3:16 PM, Igor Kravzov w

Re: PutMail processor - how to send to multiple recipients?

2016-06-29 Thread Mark Payne
Igor , You can use a comma-separated list of e-mail names. Thanks -Mark > On Jun 29, 2016, at 3:16 PM, Igor Kravzov wrote: > > Guys, > > What to put in To property to send to multiple recipients? > > Thanks in advance.

Re: Custom Controller Service

2016-06-17 Thread Mark Payne
o be on controller > services and let the user schedule how often that should get invoked > and just never let them set it to 0 in the case of a controller > service. > > On Fri, Jun 17, 2016 at 3:25 PM, Mark Payne wrote: >> Hi Kumiko, >> >> I wo

Re: Custom Controller Service

2016-06-17 Thread Mark Payne
Hi Kumiko, I would recommend that in your OnEnabled method you just create a ScheduledExecutorService and schedule the task to occur every 24 hours or however you'd like. Then, in your OnDisabled method call shutdown on that ScheduledExecutorService. A word to the wise, though - NiFi tends

Re: Scheduling using CRON driven on Windows OS

2016-06-16 Thread Mark Payne
ache.org/jira/browse/NIFI-401> [2] https://github.com/apache/nifi/pull/512 <https://github.com/apache/nifi/pull/512> > On Jun 16, 2016, at 6:25 PM, Keith Lim wrote: > > Thanks Mark. I appreciate the effort and the quality work that you guys are > doing. > > Th

Re: Scheduling using CRON driven on Windows OS

2016-06-16 Thread Mark Payne
Keith, I believe there already is a PR for this. I did an initial review and things looked good but it touches some very critical parts of the application and needs to be scrutinized and reviewed much more thoroughly before being merged in. Thanks -Mark Sent from my iPhone > On Jun 16, 2016

Re: How to effectively log the data flow in NiFi?

2016-06-13 Thread Mark Payne
at (an as yet unchanged from original > download) conf/logback.xml in 0.6.1 and I'm seeing: > > - no string of characters "org.apache.nifi.processors" and > - INFO all over in places like level="INFO"/> > > What am I missing? Was your comment valid

Re: How to use ListenHTTP processor?

2016-06-10 Thread Mark Payne
Huagen, That property allows you to specify a Regular Expression that it will match against all HTTP Header names. If a header name matches the regex, an attribute will be added to the FlowFile with that name and value. For example, if your HTTP Headers look like: Content-Type: application/json
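
The header-to-attribute behaviour described above can be sketched in plain Python. This is not ListenHTTP's code: the function name is hypothetical, and the sketch assumes the regex has to match the whole header name, which the message does not state.

```python
import re


def headers_to_attributes(headers, name_regex):
    """Copy every header whose name matches name_regex into an attribute dict."""
    pattern = re.compile(name_regex)
    # Assumption: the regex must match the entire header name
    return {name: value for name, value in headers.items() if pattern.fullmatch(name)}


headers = {"Content-Type": "application/json", "X-Request-Id": "42", "Accept": "*/*"}
print(headers_to_attributes(headers, r"X-.*|Content-Type"))
```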

Re: How to effectively log the data flow in NiFi?

2016-06-10 Thread Mark Payne
Hi Huagen, This is typically the type of logging you will see in NiFi. Each processor will generally log at an INFO level what it is doing for each FlowFile. Unfortunately, though, this can become extremely verbose, and many people want that logging toned down, so in the master branch of NiFi,

Re: nifi memory question

2016-06-07 Thread Mark Payne
ush those things out of the heap for large sessions like that. > You should be able to use it like you did and see no perf implications. That > we must fix. > > Mark Payne...do you recall the jira number to tackle this? > > Thanks > Joe > > On Jun 7, 2016 6:23

Re: Controls order of execution in a queue?

2016-06-05 Thread Mark Payne
Huagen, NiFi provides the concept of Prioritizers to determine how data is Prioritized / sorted. You can right-click on a connection and click Configure. From the Settings tab, you can drag and drop to select which prioritizers are used and in what order. For instance, you can drag the First In

Re: getFile Content as json element

2016-06-02 Thread Mark Payne
nd line of file > ] > } > > > { > filename: "${filename}", > fileTime: ${now()}, > content: [ >third line of file > ] > } > > > On Wed, Jun 1, 2016 at 2:02 PM, Mark Payne <mailto:marka...@hotmail.com>> wrote:

Re: Spark or custom processor?

2016-06-02 Thread Mark Payne
Simple Event > Processing. I say ‘probably’ because I am bringing in some other data to > compare the data against (bad domains and maybe others), but certainly isn’t > doing anything clever at the moment in terms of windowing/ aggregation with > previously seen data etc. >

Re: Spark or custom processor?

2016-06-02 Thread Mark Payne
Conrad, Typically, the way that we like to think about using NiFi vs. something like Spark or Storm is whether the processing is Simple Event Processing or Complex Event Processing. Simple Event Processing encapsulates those tasks where you are able to operate on a single piece of data by itsel

Re: getFile Content as json element

2016-06-01 Thread Mark Payne
Sven, Have you had a look at the ReplaceText processor? You could use the Regular Expression (.+) to match the entire content of the FlowFile and then replace it with something like: { filename: "${filename}", fileTime: ${now()}, content: [ $1 ] } The $1 is a back-reference that wil
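
The capture-and-wrap idea can be tried outside NiFi with Python's `re` module. In this sketch the function name is hypothetical, `fileTime` is left out so the output stays deterministic, and `re.DOTALL` is used so that `(.+)` spans line breaks; whether ReplaceText needs an equivalent setting is not covered here.

```python
import re


def wrap_content(content, filename):
    """Wrap the whole content in a JSON-like envelope using capture group 1."""
    pattern = re.compile(r"(.+)", re.DOTALL)

    def envelope(match):
        # match.group(1) plays the role of the $1 back-reference
        return '{ filename: "%s", content: [ %s ] }' % (filename, match.group(1))

    return pattern.sub(envelope, content, count=1)


print(wrap_content("first line of file", "a.txt"))
# { filename: "a.txt", content: [ first line of file ] }
```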

Re: OutOfMemoryError from ListSFTP

2016-06-01 Thread Mark Payne
expensive. We definitely should find a > way to sort that out better if my memory on this is correct. > > Mark Payne; Can you confirm or correct? > > Thanks > Joe > > On Wed, Jun 1, 2016 at 9:24 AM, Huagen peng wrote: >> Hi, >> >> I tried to use th

Re: MergeContent questions

2016-05-31 Thread Mark Payne
Igor, MergeContent will consider a 'bin' full when any one of those conditions is hit. I.e., if you set: Max Group Size = 64 MB Max Number of Entries = 100 Max Bin Age = 5 mins Then you will get a merged bin whenever a bin hits 64 MB, regardless of how long it's been or how many entries there are.
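
The "any one condition" rule can be written as a small predicate. The three thresholds are the example values from the message above; the function name and the use of `>=` for each comparison are assumptions, and this is not MergeContent's actual implementation.

```python
def bin_is_full(size_bytes, entries, age_seconds,
                max_size_bytes=64 * 1024 * 1024,
                max_entries=100,
                max_age_seconds=5 * 60):
    """Return True as soon as any single threshold is reached."""
    return (size_bytes >= max_size_bytes
            or entries >= max_entries
            or age_seconds >= max_age_seconds)


# A bin that reaches 64 MB is merged even if it is young and holds few entries
print(bin_is_full(64 * 1024 * 1024, entries=3, age_seconds=10))  # True
print(bin_is_full(1024, entries=5, age_seconds=10))              # False
```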

Re: site to site UnknownHostException

2016-05-31 Thread Mark Payne
Hi Joe, On the sending side, do you have gbrdc00015n01 in your hosts file? Can you ping that node? Is the node that you are trying to send to a NiFi cluster or a standalone instance? I have often seen firewalls cause UnknownHostException to get thrown, as well, when using site-to-site, so pleas

Re: Cluster setup - NCM error

2016-05-19 Thread Mark Payne
I am actually currently re-working some things on 'master' and ran into this exact error message a few minutes ago. For me, the problem was due to the fact that I didn't have the "nifi.cluster.manager.protocol.port" property set. Can you verify that property is set? Thanks -Mark > On May 19,

Re: Use Case...Please help

2016-05-16 Thread Mark Payne
e and can directly pull the logs from repository. > > Please confirm. > > Thanks, > Deepak > > From: Mark Payne [mailto:marka...@hotmail.com] > Sent: Monday, May 16, 2016 5:06 PM > To: users@nifi.apache.org > Subject: Re: Use Case...Please help > > De

Re: Use Case...Please help

2016-05-16 Thread Mark Payne
ry each day it should pull up that > too. > 3- It should not delete any logs from the source repository. > 4- It should copy specified logs in one directory and xml in other > directory in HDFS. > > In such a case we can remove the concept of script. > > Hoping

Re: Use Case...Please help

2016-05-15 Thread Mark Payne
Hi Deepak, Certainly, this is something that you could use NiFi for. We often see people using NiFi to sync data from a directory on local disk to a directory in HDFS. This is typically accomplished by using a flow like: ListFile -> FetchFile -> PutHDFS You can then create a file in the source

Re: Assign FlowFile (split text) to an attribute - which processor to use?

2016-05-11 Thread Mark Payne
Igor, You can use the ExtractText processor instead of UpdateAttribute. It allows you to use a Regular Expression to match against the content of a FlowFile and create an attribute from the Capturing Group. So you could add a property to the processor named "myAttribute" with a value of "(.+)" a

Re: Build error for the custom processor against 0.7.0-SNAPSHOT

2016-05-10 Thread Mark Payne
We currently have the 'master' branch at 1.0.0-SNAPSHOT and a support branch at 0.7.0-SNAPSHOT. However, unless I am mistaken, I do not believe we publish SNAPSHOT jars to the remote Maven repository. Is it possible that you cleared your local maven repository and that's why you can't use the ar

Re: How to convert data in csv file into json data in nifi

2016-05-09 Thread Mark Payne
Venkatesh, Right now, there is no direct way to go from CSV to JSON. You can however convert CSV to Avro with the ConvertCSVToAvro processor and then go from Avro to JSON via the ConvertAvroToJSON. The ConvertCSVToAvro Processor will require an Avro schema, but this can be automatically detecte

Re: Expression Language 'getOrDefault'

2016-05-03 Thread Mark Payne
> On Tue, May 3, 2016 at 1:22 PM, Mark Payne <mailto:marka...@hotmail.com>> wrote: > Hi Devin, > > I think the 'replaceNull' function is what you are looking for. For example: > > ${ greeting:replaceNull('hello'):length():gt(5) } > > You c

Re: Expression Language 'getOrDefault'

2016-05-03 Thread Mark Payne
Hi Devin, I think the 'replaceNull' function is what you are looking for. For example: ${ greeting:replaceNull('hello'):length():gt(5) } You can find the documentation for this function at http://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#replacenull
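
Read step by step, the expression above substitutes a default when the attribute is missing and then tests the length. A plain-Python equivalent, assuming attributes are held in a dict and using a hypothetical function name:

```python
def greeting_is_long(attributes):
    """Mirror of ${greeting:replaceNull('hello'):length():gt(5)}."""
    greeting = attributes.get("greeting")
    if greeting is None:
        greeting = "hello"  # replaceNull('hello')
    return len(greeting) > 5  # length():gt(5)


print(greeting_is_long({}))                            # False, 'hello' has length 5
print(greeting_is_long({"greeting": "good morning"}))  # True
```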

Re: NiFi Clustering Issue

2016-04-29 Thread Mark Payne
Chris, Under the hood, NiFi is using an embedded Jetty Server to handle HTTP Requests. The documentation in the web server indicates: The network interface this connector binds to as an IP address or a hostname. If null or 0.0.0.0, then bind to all interfaces. So if not specified, it should b

Re: Apache NiFi with Fail2Ban

2016-04-27 Thread Mark Payne
Stephane, You should be able to uninstall the nifi service manually by removing the following files: /etc/rc2.d/S65nifi /etc/init.d/nifi /etc/rc2.d/K65nifi Thanks -Mark > On Apr 27, 2016, at 5:26 AM, Stéphane Maarek > wrote: > > Hi, > > I think I have messed something up and I need some h

Re: Connecting a Reporting Task to a Processor

2016-04-27 Thread Mark Payne
Hi Brett, Reporting Tasks are not designed to send data specifically to Processors, but rather to send data to 'some arbitrary place'. In the case of MonitorDiskUsage, it is simply logging the information. So I don't think that connecting the reporting task to a processor is the right route to

Re: Default termination for relationships

2016-04-25 Thread Mark Payne
Manish, I think changing the default to auto terminate would be a rather dangerous move. When a user is creating a dataflow, it would be very easy to overlook one of the relationships on a Processor and forget to configure it. If it were configured to auto-terminate, the data that is routed to

Re: Need help understanding backpressure

2016-04-20 Thread Mark Payne
793> > On Apr 19, 2016, at 11:51 AM, McDermott, Chris Kevin (MSDU - > STaTS/StorefrontRemote) wrote: > > Not exactly what I was thinking, but this is better! > > Thanks, > Chris > > > > > On 4/19/16, 8:44 AM, "Mark Payne" wrote: > >>

Re: Need help understanding backpressure

2016-04-19 Thread Mark Payne
> > Chris > > > > > On 4/15/16, 1:12 PM, "Mark Payne" wrote: > >> Chris, >> >> When you apply backpressure to that connection, it will cause the processor >> that is >> the source of the connection to stop being scheduled to

Re: Need help understanding backpressure

2016-04-15 Thread Mark Payne
Chris, When you apply backpressure to that connection, it will cause the processor that is the source of the connection to stop being scheduled to run until the queue clears out. However, as you noted, data will still queue up in that processor's incoming connections. So to force backpressure t
