Re: "External" extensions
In the nodejs/npm world, each module has a package.json, which declaratively indicates which Node version and other external modules it depends on. Similarly, I am thinking NAR modules can declare which versions of the JVM and NiFi they depend on, and also which other modules they depend on. NPM (NiFi Package Manager) can warn users if the module they are trying to install doesn't match their runtime.

-Sumo

> On Nov 1, 2015, at 2:14 PM, Oleg Zhurakousky wrote:
>
> Well the question still remains unanswered, what relationship those projects have to ASF distribution of NiFi? I seriously doubt that anyone on this list suggests that all have to be part of the release. And if they are not then they are just individual projects managed in/out of ASF, right?
>
>> On Nov 1, 2015, at 17:10, Adam Estrada wrote:
>>
>> The elasticsearch project has a really cool plugin utility that automatically downloads and builds plugins from GitHub, BitBucket, etc...
>>
>> Has anyone taken a look at that?
>>
>> A
>>
>>> On Nov 1, 2015, at 3:54 PM, Benson Margulies wrote:
>>>
>>> ASF policy; a PMC should not be in the business of creating and maintaining code 'somewhere else' and/or under another license, for fear of confusion.
>>>
>>> Gray area -- some PMC members can be in that business, as long as the boundary is clear.
>>>
>>> There was a thing called 'apache extras' for this. Unfortunately, it was hosted as part of google code, which is defunct. As far as I know, various plans to replace it have not come to fruition, but I might be behind.
>>>
>>>> On Sun, Nov 1, 2015 at 3:04 PM, wrote:
>>>>
>>>> How about maintaining a registry like npm https://www.npmjs.com or https://github.com/jspm/registry where individuals host their modules on github and users can discover them via registry?
>>>>
>>>>> On Nov 1, 2015, at 10:35 AM, Joe Witt wrote:
>>>>>
>>>>> " but raises several questions, all pertaining to the relationship of this project with ASF, its ownership and control."
>>>>>
>>>>> ...that is what I'm struggling to respond to as well.
>>>>>
>>>>> It feels like the right path within the ASF is to establish child projects of Apache NiFi. I think we knew we needed to do this anyway as we've mentioned before. It just might be time now...
>>>>>
>>>>>> On Sun, Nov 1, 2015 at 1:34 PM, Oleg Zhurakousky wrote:
>>>>>>
>>>>>> Tony, plenty of opinion but so are the questions/concerns. Managing it on GitHub is perfect, but raises several questions, all pertaining to the relationship of this project with ASF, its ownership and control. Perhaps some PMCs on the list can shed some light as to how it could be done?
>>>>>>
>>>>>> Cheers
>>>>>> Oleg
>>>>>>
>>>>>>> On Nov 1, 2015, at 13:08, Adam Estrada wrote:
>>>>>>>
>>>>>>> This has been suggested before. It's a great idea!!! I suggest creating a repo on github for NiFi-Processors or something like that. There are many more folks searching through GitHub than on the Apache wikis, IMO. This will inevitably help spread the word...
>>>>>>>
>>>>>>> A
>>>>>>>
>>>>>>>> On Nov 1, 2015, at 12:55 PM, Tony Kurc wrote:
>>>>>>>>
>>>>>>>> Not very strong opinions on this?
>>>>>>>>
>>>>>>>>> On Oct 30, 2015 10:53 AM, "Joe Witt" wrote:
>>>>>>>>>
>>>>>>>>> Tony,
>>>>>>>>>
>>>>>>>>> I completely agree we should do this. A quick github search reveals there are some nice utilities/processors folks have built for NiFi but for which they're not necessarily going to submit them as PRs. We should link to these as much as possible but we should also help folks understand these aren't 'apache' things and are not of the Apache NiFi community directly but they are good for users and developers to know about.
>>>>>>>>>
>>>>>>>>> Perhaps a wiki page linking to these is good provided we have the above sort of disclaimer and a healthy recognition such references will become stale...
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Joe
>>>>>>>>>
>>>>>>>>>> On Fri, Oct 30, 2015 at 10:48 AM, Tony Kurc wrote:
>>>>>>>>>>
>>>>>>>>>> All,
>>>>>>>>>>
>>>>>>>>>> I wanted to start a conversation about projects that are good for people using or developing NiFi, but either can't or don't belong in the source tree. This could be due to licensing issues (for example not compatible (or not yet determined if it is compatible (GPL [1])) with the Apache
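
The package.json-style idea at the top of this thread could look roughly like the sketch below. It is only an illustration under stated assumptions: the manifest layout, the field names (`requires`, `jvm`, `nifi`, `modules`), the example module names, and the `check_manifest` helper are all hypothetical and are not an existing NiFi or NAR format. Versions are assumed to be plain dotted integers and are treated as minimum versions.

```python
import json


def parse_version(text):
    """Turn '0.3.0' into (0, 3, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def check_manifest(manifest, runtime):
    """Return a list of warnings; an empty list means the module fits the runtime.

    Both arguments are plain dicts. The layout is hypothetical, not a NiFi format.
    """
    warnings = []
    requires = manifest.get("requires", {})
    name = manifest["name"]
    # Check the platform components the module declares a minimum version for.
    for component in ("jvm", "nifi"):
        needed = requires.get(component)
        if needed is None:
            continue
        have = runtime.get(component)
        if have is None or parse_version(have) < parse_version(needed):
            warnings.append(f"{name} needs {component} >= {needed}, runtime has {have}")
    # Check dependencies on other installed modules.
    installed = runtime.get("modules", {})
    for module, needed in requires.get("modules", {}).items():
        have = installed.get(module)
        if have is None:
            warnings.append(f"{name} needs module {module} >= {needed}, which is not installed")
        elif parse_version(have) < parse_version(needed):
            warnings.append(f"{name} needs module {module} >= {needed}, installed is {have}")
    return warnings


# Hypothetical manifest a module author might ship alongside a NAR.
MANIFEST_JSON = """
{
  "name": "example-processors-nar",
  "version": "1.0.0",
  "requires": {
    "jvm": "1.7",
    "nifi": "0.4.0",
    "modules": {"example-services-nar": "1.0.0"}
  }
}
"""

manifest = json.loads(MANIFEST_JSON)
runtime = {"jvm": "1.8", "nifi": "0.3.0", "modules": {"example-services-nar": "1.2.0"}}
for warning in check_manifest(manifest, runtime):
    print("WARNING:", warning)
```

With the sample runtime above, the only mismatch is the NiFi version, so a single warning is printed. A registry of the kind suggested later in the thread could run the same check before download.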
RE: Next release?
Joe,

This reminds me... are there any entry or exit criteria (from a defects perspective) established for NiFi releases? In other words, what are the criteria for determining when the code is ready for release and production use?

Thanks
Rick

-----Original Message-----
From: Joe Witt [mailto:joe.w...@gmail.com]
Sent: Monday, November 02, 2015 8:56 AM
To: dev@nifi.apache.org
Subject: Re: Next release?

Team...we def need to address or move a good bit of ticketage to move towards an RC. It isn't critical we do it 'now' but we should strive for 6 to 8 week release cycles in my view.

We should also decouple the framework/app releases from those of processors in my view but we can kick off another thread for discussion there.

Thanks
Joe

On Oct 29, 2015 11:50 AM, "Joe Witt" wrote:

> mike - that is good to know. Look forward to seeing the ticket. If you can put the thread dumps up that would obviously be awesome though I recognize why that is non-trivial.
>
> Thanks
> Joe
>
> On Thu, Oct 29, 2015 at 11:18 AM, Michael Moser wrote:
>
>> All,
>>
>> On an extremely busy cluster that I work with, I've noticed some thread starvation issues on the NCM. It manifests as the "spinning wheel of death" when refreshing the NiFi UI. Thread and heap dumps point to the WebClusterManager in the framework. I've made some small quick-win changes that I'm testing now, but would appreciate feedback from the community. I will write up a ticket shortly that explains it, but would like to see it in 0.4.0 if reviewers agree with the changes.
>>
>> Thanks,
>> -- Mike
>>
>> On Thu, Oct 29, 2015 at 10:04 AM, Joe Witt wrote:
>>
>>> I haven't done it in a while. Am happy to take it. We need to scrub the items assigned to 040 and pick our must haves ...
>>>
>>> On Oct 29, 2015 9:20 AM, "Sean Busbey" wrote:
>>>
>>>> Hi Folks!
>>>>
>>>> Tomorrow marks 6 weeks since the 0.3.0 release. Any one up for starting a release candidate?
>>>>
>>>> --
>>>> Sean
RE: Next release?
The current process is outlined in our release guide. But the main idea is that all who wish to participate in release validation do so from the RC. Unit tests are of course run by the builds but we rely on people power to verify system level testing and that is part of that testing folks should do. We obviously can't test all the things and environments and so on with this model. The more CI we can get established the better we can do. But we have much room for improvement in validating releases.
Re: LogAttribute - Sending that output to a custom logger?
Mark

All fair points. Can you please point out which processor docs specifically should be better. Let's fix em..you will quickly lose that new user vibe and not notice what needs to improve as much. We need to make the new user experience awesome.

Thanks
Joe
Re: LogAttribute - Sending that output to a custom logger?
We greatly appreciate contributions. Your prescribed approach sounds great and if you are willing to give us a few cycles pointing out, and optionally correcting, the items that are in need of improvement, we will certainly incorporate.

Thanks!

On Mon, Nov 2, 2015 at 1:28 PM, Mark Petronic wrote:

> I'm sort of in the camp of "don't come with a complaint if you don't come with a solution" and hesitated to even raise the documentation comment without just fixing it myself. How about this, I just do some updates on some processor docs myself and use that as my first contribution to work through the process of committing to this project?
>
> But, to give you one quick example, EvaluateJSONPath (which, btw has pretty good docs otherwise) does not mention HOW to extract the JSON you are interested in. I had to look at the code to figure out it used this: https://github.com/jayway/JsonPath. Ok, that was not hard, I admit, but, as a user, should I need to look at the code for such information? I submit, no. Me personally, I like to dig into the code. So, this is more a comment on "overall goodness" for the general new user experience.
>
> I agree with your assessment of 'new user vibe' as I am starting to not notice it as much. lol
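
For readers who have not met JSONPath, the sketch below shows the general idea behind the EvaluateJSONPath example mentioned in this thread: a path expression walks a parsed JSON document to pull out one value. It is a hand-written toy that handles only `$`, `.name`, and `[index]`; it is not the Jayway JsonPath library, and the `evaluate` helper and the sample document are assumptions made up for illustration. The Jayway README linked in the thread describes the full syntax.

```python
import json
import re

# One token is either .name or [index]; this covers only a small subset of JSONPath.
TOKEN = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")


def evaluate(path, document):
    """Evaluate a simple path such as $.store.book[0].title against parsed JSON."""
    if not path.startswith("$"):
        raise ValueError("path must start with $")
    current = document
    position = 1  # skip the leading $
    while position < len(path):
        match = TOKEN.match(path, position)
        if match is None:
            raise ValueError(f"unsupported path syntax at offset {position}")
        name, index = match.groups()
        # Step into an object member or a list element, whichever the token names.
        current = current[name] if name is not None else current[int(index)]
        position = match.end()
    return current


# Stand-in for the JSON content of a flowfile.
flowfile_content = json.loads('{"store": {"book": [{"title": "Moby Dick", "price": 8.99}]}}')
print(evaluate("$.store.book[0].title", flowfile_content))
```

A short example of this kind in the processor documentation, using the real library's syntax, is the sort of addition the message above is asking for.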
Re: LogAttribute - Sending that output to a custom logger?
This thread has forked into two different conversations: 1. improvements to LogAttribute processor; 2. improvements to processor documentation.

1) re: improvements to LogAttribute - we already have NIFI-67 [1] that suggests a number of improvements to LogAttribute. One of these is the use of a custom name for the logger so that logback rules can be written against that name.

While the provenance engine is great for many scenarios, in my opinion, it doesn't replace the need for true text-based logging. The tooling for log processing is very mature and there's no ability to "grep" a provenance repository, migrate or offload provenance logs into deep storage, store log events into a database, or do any other cool syslogd or logback type things. Being able to capture and log a flowfile at the exact right place in the data flow and processing it using the command line is an extremely valuable tool in the toolkit.

For a long time, I've wanted to work on at least some of the things mentioned in NIFI-67 and will hopefully get to do so time willing. Having a custom "name" for the LogAttribute processor seems like a no-brainer. Contributions for this should definitely be welcome!

2) improvements to processor document - I agree, even as a somewhat seasoned NIFI user, I still have a hard time reading and understanding the processor documentation. I often do exactly what Mark P. suggests and instead go directly to the source. Any contribution towards better processor documentation is greatly appreciated!

[1] https://issues.apache.org/jira/browse/NIFI-67
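
The custom logger name idea discussed in this thread can be illustrated outside NiFi. The sketch below uses Python's standard `logging` module as a stand-in for logback, so it shows the concept rather than NiFi's actual code: the logger names `nifi.app` and `nifi.attributes` and the helpers `build_loggers` and `log_attributes` are invented for the example. The point is that once attribute output goes to its own named logger, routing rules can send it to a separate destination.

```python
import io
import logging


def build_loggers(app_stream, attribute_stream):
    """Create two named loggers that write to separate destinations."""
    app_logger = logging.getLogger("nifi.app")
    attribute_logger = logging.getLogger("nifi.attributes")
    for logger, stream in ((app_logger, app_stream), (attribute_logger, attribute_stream)):
        logger.handlers.clear()
        logger.setLevel(logging.INFO)
        logger.propagate = False  # keep records out of parent loggers
        handler = logging.StreamHandler(stream)
        handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return app_logger, attribute_logger


def log_attributes(logger, attributes):
    """Write each attribute as key=value through the given logger."""
    for key in sorted(attributes):
        logger.info("%s=%s", key, attributes[key])


# In-memory streams stand in for two separate log files.
app_out, attr_out = io.StringIO(), io.StringIO()
app_logger, attribute_logger = build_loggers(app_out, attr_out)
app_logger.info("processor started")
log_attributes(attribute_logger, {"filename": "data.csv", "uuid": "1234"})
print(attr_out.getvalue(), end="")
```

In logback terms, the equivalent would presumably be a `<logger>` element for the custom name with its own appender, which is what makes the output easy to tail or grep on its own.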
[GitHub] nifi pull request: NIFI-1051 Allowed FileSystemRepository to skip ...
Github user asfgit closed the pull request at: https://github.com/apache/nifi/pull/111
Re: LogAttribute - Sending that output to a custom logger?
Hi

Where I work we have created an attribute logger of our own. It is a fairly simple affair which uses a regex to determine which attributes to log, and writes them as key value pairs to a file, whose location is determined by a user property.

I'm happy to put this out there if anyone is interested.
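
The regex-driven attribute logger described in this message might work along the following lines. This is a guess at the shape of such a tool, not the poster's code: the `write_matching_attributes` helper, the sample attributes, and the choice to match the whole attribute name are all assumptions.

```python
import re
import tempfile
from pathlib import Path


def write_matching_attributes(attributes, pattern, output_path):
    """Append attributes whose names fully match `pattern` to `output_path` as key=value lines."""
    regex = re.compile(pattern)
    selected = {key: value for key, value in attributes.items() if regex.fullmatch(key)}
    with open(output_path, "a", encoding="utf-8") as handle:
        for key in sorted(selected):
            handle.write(f"{key}={selected[key]}\n")
    return selected


# Sample flowfile attributes; the file location plays the role of the user property.
attributes = {"filename": "data.csv", "path": "/in", "mime.type": "text/csv", "uuid": "1234"}
log_file = Path(tempfile.mkdtemp()) / "attributes.log"
write_matching_attributes(attributes, r"filename|mime\..*", log_file)
print(log_file.read_text(encoding="utf-8"), end="")
```

Only `filename` and `mime.type` match the pattern here, so only those two lines reach the file.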
Re: Common data exchange formats and tabular data
Hello all,

I am new to the NiFi community but I have a good amount of experience with ETL tools and applications that process lots of tabular data. In my experience, JSON is only useful as the common format for tabular data if it has a "flat" schema, in which case there aren't any advantages for JSON over other formats such as CSV. However, I've seen lots of "CSV" files that don't seem to adhere to any standard, so I would presume NiFi would need a rigid schema such as RFC-4180 (http://www.rfc-base.org/txt/rfc-4180.txt).

However CSV isn't a natural way to express the schema of the rows, so JSON or YAML is probably a better choice. There's a format called Tabular Data Package that combines CSV and JSON for tabular data serialization: http://dataprotocols.org/tabular-data-package/

Avro is similar, but the schema must always be provided with the data. In the case of NiFi DataFlows, it's likely more efficient to send the schema once as an initialization packet (I can't remember the real term in NiFi), then the rows can be streamed individually, in batches of user-defined size, sampled, etc.

Having said all that, there are projects like Apache Drill that can handle non-flat JSON files and still present them in tabular format. They have functions like KVGEN and FLATTEN to transform the document(s) into tabular format. In the use cases you present below, you already know the data is tabular and as such, the extra data model transformation is not needed. If this is desired, it should be apparent that a Streaming JSON processor would be necessary; otherwise, for large tabular datasets you'd have to read the whole JSON file into memory to parse individual rows.

Regards,
Matt

From: Toivo Adams
Date: Monday, November 2, 2015 at 5:12 AM
Subject: Common data exchange formats and tabular data

All,

Some processors get/put data in tabular form (PutSQL, ExecuteSQL, soon Cassandra). It would be very nice to be able to use such processors in a pipeline where the previous processor's output is the next processor's input. To achieve this, processors should use a common data exchange format.

JSON is most widely used, it's simple and readable. But JSON lacks schema. Schema can be very useful to automate data insert/update.

Avro has schema, but is somewhat more complicated and not widely used (yet?).

Please see also:
https://issues.apache.org/jira/browse/NIFI-978
https://issues.apache.org/jira/browse/NIFI-901

Opinions?

Thanks
Toivo
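
The "send the schema once, then stream rows in batches" idea from Matt's reply can be sketched as a small generator. Everything here is illustrative: the packet dictionaries, the `stream_table` helper, and the sample data are assumptions, and the schema layout is only loosely modelled on the name/type field lists used by formats such as Tabular Data Package. It is not a NiFi API.

```python
import csv
import io
import json
from itertools import islice


def stream_table(csv_text, schema, batch_size):
    """Yield one schema packet, then row batches of at most `batch_size` rows."""
    yield {"type": "schema", "fields": schema}
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    names = [field["name"] for field in schema]
    # Refuse data whose header does not match the declared schema.
    if header != names:
        raise ValueError(f"CSV header {header} does not match schema {names}")
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            break
        yield {"type": "rows", "rows": batch}


# Hypothetical schema and RFC 4180 style CSV (CRLF line endings, quoted comma).
SCHEMA = [{"name": "id", "type": "integer"}, {"name": "city", "type": "string"}]
CSV_TEXT = 'id,city\r\n1,Tallinn\r\n2,"Austin, TX"\r\n3,Oslo\r\n'

packets = list(stream_table(CSV_TEXT, SCHEMA, batch_size=2))
for packet in packets:
    print(json.dumps(packet))
```

Because the rows are read lazily, a consumer never needs the whole table in memory, which is the concern raised above about parsing one large JSON document.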
Re: LogAttribute - Sending that output to a custom logger?
My primary use is for understanding Nifi. I like to direct various processors output into both their logical next processor stage as well as into a log attribute processor. Then I tail the Nifi app log file and watch what happens - in real time. I do not intend to use this for long term log retention. I agree that providence is the right choice for that. So, the only reason I wanted to allow configuration of a custom logger was simply to isolate all the attribute-rich logging from the normal logging because I was primarily interested in the attribute flows as a way to (a) better understand what a processor emits because, frankly, the documentation of some of the processors is very sparse. So, I learn imperatively, so to speak. I say that as a new user. I feel I should be able to get a pretty good understanding of a processor by reading the usage. But I am finding that the documentation, in some cases, is more like what I like to refer to as, "note to self" documentation. Great if you are the guy who wrote the processor with those "insights" - not so great if you are not the developer. So, then I need to dig up the code. That should not be needed as the first step of understanding a processor as a new user. There is some well documented processors but not all are, IMHO. (b) Validate my flows with some test data and verify attribute values look correct and routing is happen on them as expected, etc. Again, easier, IMO, to see in the logs than digging into the providence data. Maybe this is just a good "private" feature for me so maybe I will just create a private version to use on my own. I already have it working but would need more polish to achieve PR status. Maybe this is the sort of thing that others would not find beneficial? That's fine. There are others ways I can contribute in the future. I'm still having fun! :) On Sun, Nov 1, 2015 at 12:41 PM, Joe Wittwrote: > Mark Petronic, > > I share Payne's perspective on this. 
But I'd also like to work with > you to better understand the workflow. For those of us that have used > this tool for a long time there is a lot we take for granted from a > new user perspective. We believe the provenance feature to provide a > far superior option to understanding how an item went through the > system and the timing and what we knew when and so on. But, it would > be great to understand it from your perspective as someone learning > NiFi. Not meaning to take away from your proposed contrib - that > would be great too. Just want to see if the prov user experience > solves what you're looking for and if not can we make it do that. > > Thanks > Joe > > On Sun, Nov 1, 2015 at 11:23 AM, Mark Payne wrote: > > Mark, > > > > To make sure that I understand what you're proposing, you want to add a property to > > LogAttribute that allows users to provide a custom logger name? > > > > If that is indeed what you are suggesting then I think it's a great idea. > > > > That being said, in practice I rarely ever use LogAttribute and we even considered removing > > it from the codebase before we open sourced, because the Data Provenance provides a > > much better view of what's going on to debug your flows. > > > > I know you're pretty new to NiFi, so if you've not yet had a chance to play with the Provenance, > > you can see the section in the User Guide at > > http://nifi.apache.org/docs/nifi-docs/html/user-guide.html#data-provenance > > > > If you're interested in updating the LogAttribute processor, though, we'd be happy to have > > that contribution added, as it does make the Processor more usable. > > > > Thanks > > -Mark > > > >> On Oct 31, 2015, at 12:35 PM, Mark Petronic wrote: > >> > >> From the code, it appears it cannot be done as the attribute logging > >> goes to the same getLogger() instance as the normal nifi-app traces.
Has > >> anyone considered making that configurable, maybe allowing you to > >> define a different logger name for LogAttribute and then creating that > >> logger definition in the logback conf, allowing flexibility? I'm using > >> attribute logging heavily as I try to better learn/debug NiFi (it gives > >> you a nice 'under the hood' view of the flow) and build up some flows > >> and feel it would be beneficial to be able to capture the LogAttribute > >> messages by themselves for more clarity on what is happening. I would > >> not mind maybe trying to implement this feature as my first crack at > >> contributing to the project. Seems like a fairly easy one that would > >> allow me to "go through the motions" of a full pull request process > >> and iron out the process. Anyone have any thoughts on this? > > >
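The proposal in this message (give LogAttribute its own logger name, then route that name separately in the logging configuration) can be sketched roughly as follows. NiFi is Java and uses logback, so this is only an analogy using Python's standard `logging` module, not NiFi code. The logger names `nifi.app` and `nifi.attributes` and the helpers `build_loggers` and `log_attributes` are hypothetical, chosen for illustration.

```python
import io
import logging


def build_loggers(app_stream, attr_stream):
    """Return (app_logger, attr_logger), each writing to its own stream.

    Hypothetical names: this mimics the idea of a separately named logger
    with its own appender, it is not how NiFi or logback is implemented.
    """
    app_logger = logging.getLogger("nifi.app")
    attr_logger = logging.getLogger("nifi.attributes")

    for logger, stream in ((app_logger, app_stream), (attr_logger, attr_stream)):
        logger.handlers.clear()        # start from a clean handler list
        logger.setLevel(logging.INFO)
        logger.propagate = False       # keep records out of parent/root handlers
        handler = logging.StreamHandler(stream)
        handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return app_logger, attr_logger


def log_attributes(logger, attributes):
    """Write one key=value record per attribute, in sorted key order."""
    for key in sorted(attributes):
        logger.info("%s=%s", key, attributes[key])


if __name__ == "__main__":
    app_out, attr_out = io.StringIO(), io.StringIO()
    app_log, attr_log = build_loggers(app_out, attr_out)
    app_log.info("processor started")
    log_attributes(attr_log, {"filename": "a.txt", "uuid": "123"})
    print(attr_out.getvalue(), end="")
```

Because the attribute logger does not propagate to its parent, the attribute records land only in their own stream, which is the isolation the message asks for: the attribute output can be tailed on its own without the normal application traces mixed in.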
Re: LogAttribute - Sending that output to a custom logger?
David,

This sounds like a slightly different use case than the NiFi standard LogAttribute processor. It sounds like your processor is more of a generic attribute converter and file writer. The LogAttribute processor is designed to interact with the underlying NiFi logging subsystem, not necessarily just to write files.

That being said, your processor may be a useful contribution to Apache NiFi. Specifically, the value-add of your processor might be in the key-value format you've defined to output the flowfile attributes. It might be interesting to see this expressed as an attribute-to-payload converter, chained together with potentially other processors like PutFile in the dataflow.

If you want to contribute your processor, I would recommend making it available on GitHub (or similar) for review by the Apache NiFi community. Just post a link to your contribution here or even issue a pull request for your processor. It would at least be evaluated and considered for inclusion.

Hope this helps.

Adam

On Mon, Nov 2, 2015 at 5:39 PM, davidrsm...@btinternet.com < davidrsm...@btinternet.com> wrote: > Hi > > Where I work we have created an attribute logger of our own. It is a > fairly simple affair which uses a regex to determine which attributes to > log, and writes them as key-value pairs to a file, whose location is > determined by a user property. I'm happy to put this out there if anyone is > interested. > > Sent from my HTC > > > - Reply message - > From: "Adam Taft" > Date: Mon, Nov 2, 2015 19:23 > Subject: LogAttribute - Sending that output to a custom logger? > To: > > This thread has forked into two different conversations: 1. improvements > to LogAttribute processor; 2. improvements to processor documentation. > > 1) re: improvements to LogAttribute - we already have NIFI-67 [1] that > suggests a number of improvements to LogAttribute. One of these is the use > of a custom name for the logger so that logback rules can be written > against that name.
> > While the provenance engine is great for many scenarios, in my opinion, it > doesn't replace the need for true text-based logging. The tooling for log > processing is very mature and there's no ability to "grep" a provenance > repository, migrate or offload provenance logs into deep storage, store log > events into a database, or do any other cool syslogd or logback type > things. Being able to capture and log a flowfile at the exact right place > in the data flow and processing it using the command line is an extremely > valuable tool in the toolkit. > > For a long time, I've wanted to work on at least some of the things > mentioned in NIFI-67 and will hopefully get to do so time willing. Having > a custom "name" for the LogAttribute processor seems like a no-brainer. > Contributions for this should definitely be welcome! > > 2) improvements to processor document - I agree, even as a somewhat > seasoned NIFI user, I still have a hard time reading and understanding the > processor documentation. I often do exactly what Mark P. suggests and > instead go directly to the source. Any contribution towards better > processor documentation is greatly appreciated! > > [1] https://issues.apache.org/jira/browse/NIFI-67 > > > On Mon, Nov 2, 2015 at 1:54 PM, Aldrin Piri wrote: > > > We greatly appreciate contributions. Your prescribed approach sounds > great > > and if you are willing to give us a few cycles pointing out, and > optionally > > correcting, the items that are in need of improvement, we will certainly > > incorporate. > > > > Thanks! > > > > On Mon, Nov 2, 2015 at 1:28 PM, Mark Petronic > > wrote: > > > > > I'm sort of in the camp of "don't come with a complaint if you don't > come > > > with a solution" and hesitated to even raise the documentation comment > > > without just fixing it myself. 
How about this, I just do some updates > on > > > some processor docs myself and use that as my first contribution to > work > > > through the process of committing to this project? > > > > > > But, to give you one quick example, EvaluateJSONPath (which, btw has > > pretty > > > good docs otherwise) does not mention HOW to extract the JSON you are > > > interested in. I had to look at the code to figure out it used this: > > > https://github.com/jayway/JsonPath. Ok, that was not hard, I admit, > but, > > > as > > > a user, should I need to look at the code for such information? I > submit, > > > no. Me personally, I like to dig into the code. So, this is more a > > comment > > > on "overall goodness" for the general new user experience. > > > > > > I agree with your assessment of 'new user vibe' as I am starting to not > > > notice it as much. lol > > > > > > On Mon, Nov 2, 2015 at 10:15 AM, Joe Witt wrote: > > >
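The attribute logger David describes in this message (a regex decides which attributes to log, and the matches are written out as key-value pairs) can be sketched as below. This is a minimal Python illustration under stated assumptions, not David's actual processor: the function name `format_matching_attributes`, the `key=value` line format, and the sample attribute names are all hypothetical.

```python
import re


def format_matching_attributes(attributes, pattern):
    """Return 'key=value' lines for attributes whose name fully matches pattern.

    Hypothetical helper: the real processor's matching rules and output
    format are not described in the thread beyond "regex" and "key value pairs".
    """
    regex = re.compile(pattern)
    lines = []
    for key in sorted(attributes):      # sorted for stable, grep-friendly output
        if regex.fullmatch(key):
            lines.append(f"{key}={attributes[key]}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = {
        "filename": "a.json",
        "path": "/in",
        "uuid": "42",
        "mime.type": "application/json",
    }
    print(format_matching_attributes(sample, r"filename|mime\..*"))
```

Plain one-record-per-line text like this is what makes the command-line tooling mentioned in the thread (grep and friends) applicable.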
Re: LogAttribute - Sending that output to a custom logger?
https://cwiki.apache.org/confluence/display/NIFI/Contributor+Guide On Mon, Nov 2, 2015 at 9:04 PM, Adam Taft wrote: > David, > > This sounds like a slightly different use case than the NiFi standard > LogAttribute processor. It sounds like your processor is more of a generic > attribute converter and file writer. The LogAttribute processor is > designed to interact with the underlying NiFi logging subsystem, not > necessarily just to write files. > > That being said, your processor may be a useful contribution to Apache > NiFi. Specifically, the value-add of your processor might be in the > key-value format you've defined to output the flowfile attributes. It > might be interesting to see this expressed as an attribute-to-payload > converter, chained together with potentially other processors like PutFile > in the dataflow. > > If you want to contribute your processor, I would recommend making it > available on GitHub (or similar) for review by the Apache NiFi community. > Just post a link to your contribution here or even issue a pull request for > your processor. It would at least be evaluated and considered for > inclusion. > > Hope this helps. > > Adam > > > On Mon, Nov 2, 2015 at 5:39 PM, davidrsm...@btinternet.com < > davidrsm...@btinternet.com> wrote: > >> Hi >> >> Where I work we have created an attribute logger of our own. It is a >> fairly simple affair which uses a regex to determine which attributes to >> log, and writes them as key-value pairs to a file, whose location is >> determined by a user property. I'm happy to put this out there if anyone is >> interested. >> >> Sent from my HTC >> >> >> - Reply message - >> From: "Adam Taft" >> Date: Mon, Nov 2, 2015 19:23 >> Subject: LogAttribute - Sending that output to a custom logger? >> To: >> >> This thread has forked into two different conversations: 1. improvements >> to LogAttribute processor; 2. improvements to processor documentation.
>> >> 1) re: improvements to LogAttribute - we already have NIFI-67 [1] that >> suggests a number of improvements to LogAttribute. One of these is the use >> of a custom name for the logger so that logback rules can be written >> against that name. >> >> While the provenance engine is great for many scenarios, in my opinion, it >> doesn't replace the need for true text-based logging. The tooling for log >> processing is very mature and there's no ability to "grep" a provenance >> repository, migrate or offload provenance logs into deep storage, store log >> events into a database, or do any other cool syslogd or logback type >> things. Being able to capture and log a flowfile at the exact right place >> in the data flow and processing it using the command line is an extremely >> valuable tool in the toolkit. >> >> For a long time, I've wanted to work on at least some of the things >> mentioned in NIFI-67 and will hopefully get to do so time willing. Having >> a custom "name" for the LogAttribute processor seems like a no-brainer. >> Contributions for this should definitely be welcome! >> >> 2) improvements to processor document - I agree, even as a somewhat >> seasoned NIFI user, I still have a hard time reading and understanding the >> processor documentation. I often do exactly what Mark P. suggests and >> instead go directly to the source. Any contribution towards better >> processor documentation is greatly appreciated! >> >> [1] https://issues.apache.org/jira/browse/NIFI-67 >> >> >> On Mon, Nov 2, 2015 at 1:54 PM, Aldrin Piri wrote: >> >> > We greatly appreciate contributions. Your prescribed approach sounds >> great >> > and if you are willing to give us a few cycles pointing out, and >> optionally >> > correcting, the items that are in need of improvement, we will certainly >> > incorporate. >> > >> > Thanks! 
>> > >> > On Mon, Nov 2, 2015 at 1:28 PM, Mark Petronic >> > wrote: >> > >> > > I'm sort of in the camp of "don't come with a complaint if you don't >> come >> > > with a solution" and hesitated to even raise the documentation comment >> > > without just fixing it myself. How about this, I just do some updates >> on >> > > some processor docs myself and use that as my first contribution to >> work >> > > through the process of committing to this project? >> > > >> > > But, to give you one quick example, EvaluateJSONPath (which, btw has >> > pretty >> > > good docs otherwise) does not mention HOW to extract the JSON you are >> > > interested in. I had to look at the code to figure out it used this: >> > > https://github.com/jayway/JsonPath. Ok, that was not hard, I admit, >> but, >> > > as >> > > a user, should I need to look at the code for such information? I >> submit, >> > > no. Me personally, I like to dig into the code. So, this is more a >> > comment >> > > on "overall goodness" for the general new user experience. >> > > >> > > I agree with your assessment of