Teresa Jackson created NIFI-253:
-----------------------------------
Summary: Enhancement to the Core to support metrics
Key: NIFI-253
URL: https://issues.apache.org/jira/browse/NIFI-253
Project: Apache NiFi
Issue Type: Wish
Components: Core Framework
Affects Versions: 0.0.1
Reporter: Teresa Jackson
I'd like to propose an enhancement to the Core to support volume management
and trend analysis by storing attributes and content in a database, so that
they can be queried and made available for display. This information would
then be used for statistical roll ups, metrics, trend analysis, etc.
Ideally, this would work by receiving copies of local provenance events and
keeping running totals from them. This component would be like local
provenance in that it
would retain the data for some configurable period of time, based on the amount
of disk space allocated for that process. In addition, these roll ups could be
sent somewhere for even longer retention.
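As a rough illustration of the roll-up idea, here is a minimal Python sketch. It is not NiFi code: the Event and Rollup names, the hourly bucket size, and the purely time-based retention are all assumptions made for the example (the proposal ties retention to allocated disk space as well).

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    # Hypothetical stand-in for a copy of a local provenance event.
    timestamp: datetime
    event_type: str   # e.g. "RECEIVE" or "SEND"
    size_bytes: int


class Rollup:
    """Keeps running totals per (hour, event type) for a limited time."""

    def __init__(self, retention: timedelta):
        self.retention = retention
        # (hour bucket, event type) -> [file count, byte count]
        self.totals = defaultdict(lambda: [0, 0])

    def record(self, event: Event) -> None:
        # Truncate the timestamp to the hour to pick the bucket.
        bucket = event.timestamp.replace(minute=0, second=0, microsecond=0)
        entry = self.totals[(bucket, event.event_type)]
        entry[0] += 1
        entry[1] += event.size_bytes

    def expire(self, now: datetime) -> None:
        # Drop buckets older than the configured retention period.
        cutoff = now - self.retention
        for key in [k for k in self.totals if k[0] < cutoff]:
            del self.totals[key]
```

Before a bucket is expired locally, its totals could be forwarded elsewhere for longer retention.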
The goal is to keep as many hooks as possible so that other programs/services
can ingest both the local provenance logs and the rolled up summaries. There's
a growing base of people who are comfortable with NiFi graphs and local
provenance, so I think that it makes sense to build off that.
The issue I'm facing is that Provenance is fine for tracking one file if you
have a starting point, but it is not designed to do counting, summarization and
correlation of data, and it doesn't support advanced queries.
Here are some of the most immediate and pressing use cases for this design.
1. How much traffic came in yesterday (or last week)?
2. Provide statistical counts on items of interest within a flow for a given
flow/date range.
3. When was the last file sent to "System X"?
4. Did anything get sent to "System Y"?
5. How much data was marked with a certain tag?
6. How much data was scanned?
7. How much data was detected?
8. How much of a particular type of data was received in bytes?
9. How much data was processed by file count?
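To make a few of these use cases concrete, here is a hedged Python sketch that runs them as queries against a small SQLite summary table. The table name, its columns, and the sample rows are invented for illustration only; they are not an existing NiFi schema or real traffic figures.

```python
import sqlite3

# In-memory database standing in for the proposed roll-up store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rollup ("
    " day TEXT, event_type TEXT, destination TEXT,"
    " file_count INTEGER, byte_count INTEGER)"
)
# Made-up sample rows, one per day / event type / destination.
conn.executemany(
    "INSERT INTO rollup VALUES (?, ?, ?, ?, ?)",
    [
        ("2015-01-09", "RECEIVE", None, 120, 4096000),
        ("2015-01-09", "SEND", "System X", 80, 2048000),
        ("2015-01-10", "SEND", "System X", 15, 512000),
    ],
)


def traffic_received(day):
    """Use cases 1, 8 and 9: files and bytes received on a given day."""
    return conn.execute(
        "SELECT COALESCE(SUM(file_count), 0), COALESCE(SUM(byte_count), 0)"
        " FROM rollup WHERE event_type = 'RECEIVE' AND day = ?",
        (day,),
    ).fetchone()


def last_sent(destination):
    """Use cases 3 and 4: most recent day anything was sent, or None."""
    return conn.execute(
        "SELECT MAX(day) FROM rollup"
        " WHERE event_type = 'SEND' AND destination = ?",
        (destination,),
    ).fetchone()[0]
```

With daily buckets, "when was the last file sent" is only answered to the day; a finer bucket size or a stored last-event timestamp would be needed for more precision.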
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)