Teresa,

Architecture diagrams: For the documents you suggest, I think this is one of
those things where you'll just have to show what you mean; it isn't really
clear what role UML or other such diagrams could play. As a general statement,
for anything you want to contribute it is good to have in mind how you think
it will help the project. One of the things we can do as a community is
document a set of goals for the project so that folks understand where and
how they can contribute, or where to suggest an expansion of those goals.
Provenance/Reporting: For the suggestions on provenance, I agree this is an
important area of focus. I'll respond more completely on your ticket NIFI-252.
For this thread, I'll just mention that the design does indeed already provide
for these things to occur; we just haven't gotten to implementing that vision.
We also need to be very cognizant of where the lanes should end. There are a
lot of great solutions out there that take in the sorts of data we expose and
would do a great job of ingesting/indexing/querying/analyzing it. We need to
be complementary to those things and build 'just enough' into NiFi for the
parts that are unique to it. Your additional thought seems to be along these
lines too, so that is great.

Thanks
Joe

On Tue, Jan 13, 2015 at 12:41 PM, Teresa Jackson <[email protected]>
wrote:

> Hi Tony,
>
> Not exactly; the primary target audience for this architecture view would
> be business executives and system architects. I'm targeting folks who are
> looking for an enterprise view or who are seeking to understand and come
> up to speed on how the Core Framework works.
>
> And the docs that I have seen and reviewed to date don't seem to fit that
> particular target audience.
>
> Also, what's the process to propose a design idea for the Core Framework?
> In reviewing some of the source code, I didn't see any software packages
> that supported metrics needs.
>
> I'd like to propose an addition or enhancement to the Core to support
> volume management and trend analysis by databasing attributes and content
> so that it is queryable and available for display. This information would
> then be used for statistical roll-ups, metrics, trend analysis, etc.
>
> Ideally, we'd do it by capturing running totals by receiving copies of
> local provenance events.
> This component would be like local provenance in that it would retain the
> data for some configurable period of time, based on the amount of disk
> space allocated to that process. In addition, these roll-ups could be sent
> somewhere for even longer retention.
>
> The goal is to keep as many hooks as possible so that other
> programs/services can ingest both the local provenance logs and the
> rolled-up summaries. There's a growing base of people who are comfortable
> with NiFi graphs and local provenance, so I think it makes sense to build
> off that.
>
> The issue I'm facing is that provenance is fine for tracking one file if
> you have a starting point, but it is not designed to do counting,
> summarization, and correlation of data. And it doesn't support advanced
> queries.
>
> Here are some of the most immediate and pressing use cases for this
> design:
>
> 1. How much traffic came in yesterday (or last week)?
> 2. Provide statistical counts on items of interest within a flow for a
>    given flow/date range.
> 3. When was the last file sent to "System X"?
> 4. Did anything get sent to "System Y"?
> 5. How much data was marked with a certain tag?
> 6. How much data was scanned?
> 7. How much data was detected?
> 8. How much of a particular type of data was received, in bytes?
> 9. How much data was processed, by file count?
>
> Another thought: this might also be a good place to hook in streaming
> services, where you can deal with the raw events and then
> summarize/aggregate as things go by.
>
> I'm completely new to this process, so I don't know if basic concept
> proposals of this sort should come in the form of an architecture diagram
> or simply in plain English.
>
> Thanks,
>
> Teresa Jackson
> Onyx Consulting Services, LLC
> Chief Engineer
>
> ________________________________________
> From: Tony Kurc <[email protected]>
> Sent: Monday, January 12, 2015 10:28 PM
> To: [email protected]
> Subject: Re: Core Components
>
> Teresa,
> Glad you're interested in contributing. I suggest reading some of the
> guides Apache has [1] on what to expect when getting involved, which
> should answer some of your questions about vetting and the board (which I
> inferred to mean the PPMC).
>
> Were you planning on doing this as developer documentation to get
> developers up to speed more quickly [2]? Thus far the documentation has
> been developed with asciidoc [3], and I certainly had some expectation
> that the developer guide would follow this path as well. Were you
> expecting to build images from UML or another tool to include in a guide?
> Or were you thinking it may be useful to have UML outside the context of
> a developer documentation guide?
>
> [1] http://www.apache.org/foundation/getinvolved.html
> [2] https://issues.apache.org/jira/browse/NIFI-152
> [3]
> http://apache-nifi-incubating-developer-list.39713.n7.nabble.com/User-Guide-td46.html
>
> Tony
>
> On Mon, Jan 12, 2015 at 8:05 PM, Teresa Jackson <[email protected]>
> wrote:
>
> > Hello everyone,
> >
> > I'm reviewing the Apache NiFi source code and would like to put together
> > some architecture diagrams of the framework's core components. What's
> > the required format for submission (UML, DODAF 2.0, et al.)? Also,
> > what's the vetting process? And are there tools/approaches/processes
> > that the board prefers be used?
> >
> > Thanks,
> >
> > Teresa Jackson
> > Onyx Consulting Services, LLC
> > Chief Engineer
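[Editor's note] The roll-up component Teresa describes in the thread above can
be sketched roughly as follows. This is a minimal illustration only: the class
name, the event fields (timestamp, component, bytes), and the query methods
are all assumptions invented for this sketch, not actual NiFi provenance APIs.
It shows the core idea of accumulating daily running totals from copies of
provenance-style events, so questions like use cases 1 and 3 become cheap map
lookups rather than scans of the full provenance log.

```python
from collections import defaultdict
from datetime import datetime, timezone

class RollupStore:
    """Hypothetical daily roll-up of provenance-style event copies.

    Each event is a plain dict with the fields this sketch assumes:
    "timestamp" (epoch seconds), "component" (processor/port name),
    and "bytes" (content size).
    """

    def __init__(self):
        # (day, component) -> running totals for that bucket
        self._totals = defaultdict(
            lambda: {"files": 0, "bytes": 0, "last_seen": None}
        )

    def record(self, event):
        """Fold one event copy into its (day, component) bucket."""
        day = datetime.fromtimestamp(
            event["timestamp"], tz=timezone.utc
        ).date().isoformat()
        bucket = self._totals[(day, event["component"])]
        bucket["files"] += 1
        bucket["bytes"] += event["bytes"]
        last = bucket["last_seen"]
        bucket["last_seen"] = (
            event["timestamp"] if last is None else max(last, event["timestamp"])
        )

    def traffic(self, day):
        """Use case 1: (file count, total bytes) across all components for a day."""
        files = sum(b["files"] for (d, _), b in self._totals.items() if d == day)
        size = sum(b["bytes"] for (d, _), b in self._totals.items() if d == day)
        return files, size

    def last_sent(self, component):
        """Use cases 3/4: most recent event timestamp for a component, or None."""
        stamps = [
            b["last_seen"] for (_, c), b in self._totals.items() if c == component
        ]
        return max(stamps) if stamps else None
```

Because only fixed-size buckets are kept, aging out old days to honor a disk
budget, or shipping the buckets elsewhere for longer retention, is a matter of
dropping or exporting map entries rather than reprocessing raw events.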
