Teresa Jackson created NIFI-253:
-----------------------------------

             Summary: Enhancement to the Core to support metrics 
                 Key: NIFI-253
                 URL: https://issues.apache.org/jira/browse/NIFI-253
             Project: Apache NiFi
          Issue Type: Wish
          Components: Core Framework
    Affects Versions: 0.0.1
            Reporter: Teresa Jackson


I'd like to propose an enhancement to the Core to support volume management
and trend analysis by storing attributes and content in a database so that
they can be queried and made available for display. This information would
then be used for statistical roll-ups, metrics, trend analysis, etc.

Ideally, we'd do this by keeping running totals computed from copies of local
provenance events.  This component would be like local provenance in that it
would retain the data for some configurable period of time, based on the amount
of disk space allocated for that process.  In addition, these roll-ups could be
sent somewhere for even longer retention.
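
To make the idea concrete, here is a minimal sketch in Python of what such a
roll-up component might look like. Everything in it is an assumption for
illustration only: the RollupStore class, the event dictionary fields
(timestamp, event_type, size_bytes), and the hourly bucket size are
hypothetical and are not NiFi APIs. Retention is modeled as a simple time
window rather than a disk-space budget.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Totals:
    files: int = 0
    bytes: int = 0


class RollupStore:
    """Keeps running per-hour totals built from copies of provenance events."""

    def __init__(self, retention_hours):
        self.retention_hours = retention_hours
        # (hour_bucket, event_type) -> Totals
        self.buckets = defaultdict(Totals)

    def record(self, event):
        # `event` is a hypothetical dict, not NiFi's actual provenance record.
        hour = event["timestamp"] // 3600
        totals = self.buckets[(hour, event["event_type"])]
        totals.files += 1
        totals.bytes += event["size_bytes"]

    def expire(self, now):
        # Drop buckets that fall outside the retention window.
        cutoff = now // 3600 - self.retention_hours
        for key in [k for k in self.buckets if k[0] < cutoff]:
            del self.buckets[key]

    def total(self, event_type, start, end):
        # Sum one event type over hour buckets whose start lies in [start, end).
        result = Totals()
        for (hour, etype), t in self.buckets.items():
            if etype == event_type and start <= hour * 3600 < end:
                result.files += t.files
                result.bytes += t.bytes
        return result


store = RollupStore(retention_hours=24)
store.record({"timestamp": 3600, "event_type": "RECEIVE", "size_bytes": 100})
store.record({"timestamp": 3700, "event_type": "RECEIVE", "size_bytes": 50})
store.record({"timestamp": 7300, "event_type": "SEND", "size_bytes": 70})
print(store.total("RECEIVE", 0, 7200))  # Totals(files=2, bytes=150)
```

Because only the totals are kept, the store stays small no matter how many
events pass through, and the same buckets could be shipped elsewhere for
longer retention.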

The goal is to keep as many hooks as possible so that other
programs/services can ingest both the local provenance logs and the rolled-up
summaries.  There's a growing base of people who are comfortable with NiFi
graphs and local provenance, so I think that it makes sense to build on that.

The issue I'm facing is that Provenance is fine for tracking one file if you
have a starting point, but it is not designed to do counting, summarization,
and correlation of data, and it doesn't support advanced queries.

Here are some of the most immediate and pressing use cases for this design.

1. How much traffic came in yesterday (or last week)?
2. Provide statistical counts on items of interest within a flow for a given
   flow/date range.
3. When was the last file sent to "System X"?
4. Did anything get sent to "System Y"?
5. How much data was marked with a certain tag?
6. How much data was scanned?
7. How much data was detected?
8. How much of a particular type of data was received in bytes?
9. How much data was processed by file count?
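
As a rough illustration of how a few of these questions could be answered once
the event data is in a queryable store, here is a small sketch using Python's
built-in sqlite3 module. The events table, its columns, the sample rows, and
the helper functions traffic_received and last_sent are all hypothetical; they
are not part of NiFi and are not a proposed schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE events (
           ts INTEGER,        -- event time, seconds since epoch
           event_type TEXT,   -- e.g. RECEIVE or SEND
           destination TEXT,  -- NULL unless the event is a send
           size_bytes INTEGER
       )"""
)
# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        (1000, "RECEIVE", None, 500),
        (2000, "RECEIVE", None, 300),
        (2500, "SEND", "System X", 500),
        (90000, "SEND", "System X", 300),
    ],
)


def traffic_received(conn, start, end):
    # Use case 1: file count and byte total received in [start, end).
    return conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(size_bytes), 0) FROM events "
        "WHERE event_type = 'RECEIVE' AND ts >= ? AND ts < ?",
        (start, end),
    ).fetchone()


def last_sent(conn, destination):
    # Use cases 3 and 4: time of the last send, or None if nothing was sent.
    row = conn.execute(
        "SELECT MAX(ts) FROM events "
        "WHERE event_type = 'SEND' AND destination = ?",
        (destination,),
    ).fetchone()
    return row[0]


print(traffic_received(conn, 0, 86400))  # (2, 800)
print(last_sent(conn, "System X"))       # 90000
print(last_sent(conn, "System Y"))       # None
```

The remaining use cases (counts by tag or by data type) would follow the same
pattern, assuming the relevant attributes were captured as extra columns.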



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
