Hi All,
I am looking for a way to obtain the total amount of data that has been
processed by a running cluster over a period of time, ideally via the REST
API.
Example of my use case:
I have, say, 50 different process groups, each of which has a connection to
some data source. Each one is continuously pulling data in, doing something
to it, then sending it out to some other external place. I'd like to
programmatically gather some metrics about the amount of data flowing through
the cluster as a whole (everything that is running across the cluster).
It looks like the following API may be the solution, but I am curious about
some of the properties:
"nifi-api/flow/process-groups/root/status?recursive=true".
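For reference, a minimal Python sketch of how that endpoint could be polled. The base URL (http://localhost:8080) and the helper names status_url and fetch_status are assumptions made for illustration, and authentication/TLS are not handled, so a secured cluster would need more than this.

```python
import json
import urllib.request

# Hypothetical base URL; adjust scheme, host, and port for the actual cluster.
BASE_URL = "http://localhost:8080"


def status_url(base_url, group_id="root", recursive=True):
    """Build the process-group status URL quoted above."""
    flag = "true" if recursive else "false"
    return (
        f"{base_url.rstrip('/')}/nifi-api/flow/process-groups/"
        f"{group_id}/status?recursive={flag}"
    )


def fetch_status(base_url=BASE_URL, group_id="root", recursive=True, timeout=10):
    """GET the status document and decode it as JSON (no auth handled here)."""
    url = status_url(base_url, group_id, recursive)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling fetch_status() against a reachable, unsecured instance should return the parsed JSON document as a dict.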
Looking at the data model (as defined in the REST API documentation) and
the actual data that is returned, my questions are:
1. Would this be the correct way to obtain this information?
2. If so, which properties should I look at? The difference between some of
them isn't immediately clear to me, for example "bytesSent" vs "bytesOut".
3. How is this data updated? It looks like a lot of these metrics are
supposed to be updated every 5 minutes. Does that mean the values I get now
are what was collected during the last 5-minute interval, and that they stay
the same until the next interval? And do the values accumulate over time, or
do they represent only a single 5-minute period? Something else?
{
  "processGroupStatus": {
    ...
    "aggregateSnapshot": {
      ...
      "flowFilesIn": 0,
      "bytesIn": 0,
      "input": "value",
      "flowFilesQueued": 0,
      "bytesQueued": 0,
      "queued": "value",
      "queuedCount": "value",
      "queuedSize": "value",
      "bytesRead": 0,
      "read": "value",
      "bytesWritten": 0,
      "written": "value",
      "flowFilesOut": 0,
      "bytesOut": 0,
      "output": "value",
      "flowFilesTransferred": 0,
      "bytesTransferred": 0,
      "transferred": "value",
      "bytesReceived": 0, // I think this is the one, but not sure
      "flowFilesReceived": 0,
      "received": "value",
      "bytesSent": 0, // I think this is the other one, but not sure
      "flowFilesSent": 0,
      "sent": "value",
      "activeThreadCount": 0,
      "terminatedThreadCount": 0
    },
    "nodeSnapshots": [{…}]
  },
  "canRead": true
}
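To make the question concrete, here is a rough Python sketch of how the two candidate fields could be read out of a response shaped like the one above. Whether "bytesReceived" and "bytesSent" are actually the right properties is exactly what I'm unsure about; the function name cluster_io_bytes and the numbers in the sample are made up for illustration.

```python
def cluster_io_bytes(status):
    """Pull the two candidate counters out of the aggregate snapshot.

    Assumes the response shape shown above:
    status["processGroupStatus"]["aggregateSnapshot"][...].
    """
    snapshot = status["processGroupStatus"]["aggregateSnapshot"]
    return {
        "bytesReceived": snapshot["bytesReceived"],
        "bytesSent": snapshot["bytesSent"],
    }


# Made-up sample values, only to show the shape being read.
sample = {
    "processGroupStatus": {
        "aggregateSnapshot": {
            "bytesReceived": 1024,
            "bytesSent": 2048,
        }
    },
    "canRead": True,
}

totals = cluster_io_bytes(sample)
print(totals["bytesReceived"], totals["bytesSent"])
```

If the values turn out to cover only a single window rather than a running total, the caller would have to poll on a schedule and sum the results itself.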
Any help or insight is always appreciated!
Cheers,
Ryan H.