RE: [EXT] Re: Correlate Processor ID in Logs
Pierre and Kevin,

Thanks for your suggestions. Based on your input, I may be able to build a hybrid monitoring system that uses both the SiteToSite Reporting Task and bulletins retrieved through REST calls.

-Karthik

-----Original Message-----
From: Pierre Villard [mailto:pierre.villard...@gmail.com]
Sent: Tuesday, August 22, 2017 2:48 PM
To: dev
Subject: [EXT] Re: Correlate Processor ID in Logs
Re: Correlate Processor ID in Logs
Hi,

I'd suggest using the SiteToSite Bulletin Reporting Task to monitor the bulletins generated by NiFi. If your reporting task is scheduled frequently enough, you shouldn't have any issue. Note that the "5 bulletins limit" is per processor.

Thanks!

2017-08-22 22:43 GMT+02:00 Kevin Doran:
Re: Correlate Processor ID in Logs
Hi Karthik,

A processor's metadata, including its name and parent process group ID, is accessible through the NiFi REST API [1] via GET /processors/{id}, which returns:

    {
      ...
      "component": {
        "id": "value",
        "parentGroupId": "value",
        "name": "value",
        "type": "value",
        ...
      }
    }

Of course, hitting the API for every log line doesn't scale, so one approach would be to build a local cache of processorId -> processorMetadata in whatever log-processing tool you are using, and use that cache to enrich each log line with the fields you require. You could build the cache lazily, i.e., start with an empty lookup table and, whenever a processor ID is not in the cache, hit the REST API to look it up.

Regards,
Kevin

[1] https://nifi.apache.org/docs/nifi-docs/rest-api/

On 8/22/17, 15:56, "Karthik Kothareddy (karthikk) [CONT - Type 2]" wrote:
Correlate Processor ID in Logs
Hello All,

I am trying to build a monitoring mechanism for our flows, and I'm considering using "nifi-app.log" as the primary source and filtering its entries based on the messages. However, I see that a given message only has the processor name and ID, for example:

    ERROR [Timer-Driven Process Thread-36] o.a.nifi.processors.standard.ExecuteSQL ExecuteSQL[id=015a1007-548f-1bf5-1836-e4e53164d184] Unable to execute SQL select query SELECT * FROM table WHERE comp_datetime <= '2017-01-31 23:59:59.813' ORDER BY datetime OFFSET 32400 ROWS FETCH NEXT 100 ROWS ONLY for StandardFlowFileRecord[uuid=fc425c66-b83d-46d2-94bc-332e43345960,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1499803802779-112000, container=default, section=384], offset=265042, length=114613],offset=53992,name=16290968101533439,size=167]

Given the above error message, it is really hard to correlate the ProcessorName/ID with the actual name of the processor or its parent process group. Is there a way to correlate them easily?

Also, I have considered using bulletins as the source, since they are more fine-grained and tied to the actual processor and the process group it belongs to. The problem with this approach is that the REST call only returns 5 bulletins each time. According to this post (https://community.hortonworks.com/questions/72411/nifi-bulletinrepository-api-returns-maximum-5-bull.html), that is a fixed value, so it is practically not feasible to capture all of them if the flow has multiple failures every second.

Any thoughts on this are much appreciated.

Thanks,
Karthik
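Kevin's suggestion in this thread, a lazily built cache of processorId -> processorMetadata used to enrich each log line, could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from anyone in the thread: the base URL (an unsecured local NiFi at http://localhost:8080/nifi-api), the regular expression, and the names ProcessorCache, fetch_processor and enrich are all hypothetical. Only the GET /processors/{id} call and the "component" fields (id, parentGroupId, name, type) come from the messages above.

```python
import json
import re
from typing import Callable, Dict
from urllib.request import urlopen

# Matches fragments such as "ExecuteSQL[id=015a1007-548f-1bf5-1836-e4e53164d184]".
# Requiring a UUID-shaped value skips "StandardResourceClaim[id=1499803802779-112000".
COMPONENT_RE = re.compile(
    r"(?P<type>\w+)\[id=(?P<id>[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})\]"
)


def fetch_processor(processor_id: str,
                    base_url: str = "http://localhost:8080/nifi-api") -> dict:
    """Call GET /processors/{id} and return its "component" object.

    base_url is an assumption (unsecured local NiFi); adjust it for your install.
    """
    with urlopen(f"{base_url}/processors/{processor_id}") as response:
        return json.load(response)["component"]


class ProcessorCache:
    """Lazily filled lookup table: processor ID -> processor metadata."""

    def __init__(self, fetch: Callable[[str], dict] = fetch_processor) -> None:
        self._fetch = fetch
        self._entries: Dict[str, dict] = {}

    def lookup(self, processor_id: str) -> dict:
        # Only hit the REST API the first time a given ID is seen.
        if processor_id not in self._entries:
            self._entries[processor_id] = self._fetch(processor_id)
        return self._entries[processor_id]


def enrich(line: str, cache: ProcessorCache) -> dict:
    """Attach processor name, type and parent group ID to a log line, if it has an ID."""
    match = COMPONENT_RE.search(line)
    if match is None:
        return {"message": line}
    meta = cache.lookup(match.group("id"))
    return {
        "message": line,
        "processorId": match.group("id"),
        "processorName": meta.get("name"),
        "processorType": meta.get("type"),
        "parentGroupId": meta.get("parentGroupId"),
    }


if __name__ == "__main__":
    # Stub standing in for the REST call so the demo runs without a NiFi instance.
    def fake_fetch(processor_id: str) -> dict:
        return {
            "id": processor_id,
            "parentGroupId": "hypothetical-group-id",
            "name": "Hypothetical processor name",
            "type": "org.apache.nifi.processors.standard.ExecuteSQL",
        }

    sample = (
        "ERROR [Timer-Driven Process Thread-36] o.a.nifi.processors.standard.ExecuteSQL "
        "ExecuteSQL[id=015a1007-548f-1bf5-1836-e4e53164d184] Unable to execute SQL select query"
    )
    print(enrich(sample, ProcessorCache(fake_fetch)))
```

The fetch function is passed in so the cache can be exercised without a running NiFi instance. Entries are never refreshed in this sketch, so a renamed or moved processor would keep its old metadata until the process restarts. A lookup for a processor that no longer exists would surface as an HTTP error from urlopen, which a real tool would probably want to catch.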