The ExecuteScript processor Aldrin mentioned can run Python scripts through Jython, so you can execute your existing script inside NiFi rather than sending the data out to Spark and back. There’s a good article from Matt Burgess on handling this [1].
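For what it's worth, here is a minimal sketch of the combining step in plain Python. It avoids Python 3-only syntax, since Jython implements Python 2. The names `LINE_RE` and `combine_session_lines` are hypothetical, and the regex is only my guess from the sample lines quoted below: group 1 is the shared header, group 2 the session id, group 3 the variable assignment. It is not wired into NiFi; reading and writing the flow file content inside ExecuteScript is the part the article at [1] walks through.

```python
import re
from collections import OrderedDict

# Assumed line layout, taken from the sample events in this thread:
#   <timestamp> <host> <process> <code> <session id>: Session variable 'k' set to 'v'
# Group 1: shared header, group 2: session id, group 3: the assignment text.
LINE_RE = re.compile(
    r"^(\S+ \S+ \S+ \S+) (\w+): Session variable ('.*' set to '.*')$"
)


def combine_session_lines(lines):
    """Collapse lines that share a header and session id into one line each."""
    grouped = OrderedDict()  # (header, session id) -> list of assignments
    passthrough = []         # lines that do not match are kept unchanged
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        match = LINE_RE.match(line)
        if match is None:
            passthrough.append(line)
            continue
        key = (match.group(1), match.group(2))
        grouped.setdefault(key, []).append(match.group(3))
    combined = [
        "%s %s: Session variables %s" % (header, sid, " ".join(parts))
        for (header, sid), parts in grouped.items()
    ]
    # Note: unmatched lines are emitted after the combined ones,
    # so the original interleaving is not preserved.
    return combined + passthrough
```

Grouping on both the header and the session id means events from different sessions in the same burst stay on separate lines. If the timestamp can differ within a session, the key would need to drop it.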
[1] https://funnifi.blogspot.com/2016/03/executescript-json-to-json-revisited_14.html

Andy LoPresto
[email protected]
[email protected]
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69

> On Jun 22, 2016, at 2:02 PM, Shaun McAdams <[email protected]> wrote:
>
> I’m working through a messy regex where I am attempting to replace contents in
> capture group 3 based on capture groups 1 and 2. Again, I have a python
> script to do all the steps (in spark instance) that I can send this out to
> and back.
> My attempt here was to see if I could figure out a solution in HDF itself.
> However as you say, I may just create our own processor or fork a current one. I
> appreciate the feedback. Thanks.
>
> --
> Shaun McAdams
>
> From: Aldrin Piri
> Reply-To: "[email protected]"
> Date: Wednesday, June 22, 2016 at 4:31 PM
> To: "[email protected]"
> Subject: Re: Nifi Combiner Processor?
>
> Hi Shaun,
>
> Thanks for the additional information. Apologies for not being able to
> follow up yesterday.
>
> As you mentioned, combining all those discrete entries into a shared header
> would not fall under the capabilities of MergeContent nor do we have
> something that maps directly short of the ReplaceText option Andy mentioned
> but could work out to be a messy regex. If you have scripting savvy,
> InvokeScriptedProcessor/ExecuteScript could be good candidates to do a script
> in one of the supported options to take one of the files from MergeContent as
> an alternative to the ReplaceText. There is also the ability to extend the
> framework with custom extensions for cases like these where users start
> getting into particular formats of data for their systems.
>
> On Wed, Jun 22, 2016 at 1:22 PM, Andy LoPresto <[email protected]> wrote:
> Shaun,
>
> Following the MergeContent, you can do a ReplaceText transform to remove the
> duplicate message headers by matching on a regular expression.
>
> Andy LoPresto
> [email protected]
> [email protected]
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
>
>> On Jun 22, 2016, at 9:43 AM, Shaun McAdams <[email protected]> wrote:
>>
>> Aldrin,
>>
>> I checked it out. Appears that merge will concat all the lines not just
>> portions of it as my request would desire.
>>
>> --
>> Shaun McAdams
>>
>> From: Shaun McAdams
>> Reply-To: "[email protected]"
>> Date: Tuesday, June 21, 2016 at 10:30 AM
>> To: "[email protected]"
>> Subject: Re: Nifi Combiner Processor?
>>
>> Thanks. I’ll look through this information.
>>
>> To answer your question, the format is here:
>> https://support.f5.com/kb/en-us/products/big-ip_ltm/manuals/product/tmos_management_guide_10_1/tmos_logging.html
>>
>> And this is what I have that was sent to me:
>>
>> ---------------
>> These events always come through in a burst. There won’t be much, if any,
>> lookahead or lookbehind required. Here is an example of a few events and
>> how we believe the transform would look:
>>
>> BEFORE
>> 2016-06-08T15:54:44+00:00 xxxxxxxxxxxxxx.com apd[7764] 01490007:6: f42579b7: Session variable 'session.policy.result' set to 'allow'
>> 2016-06-08T15:54:44+00:00 xxxxxxxxxxxxxx.com apd[7764] 01490007:6: f42579b7: Session variable 'session.logon.page.errorcode' set to '0'
>> 2016-06-08T15:54:44+00:00 xxxxxxxxxxxxxx.com apd[7764] 01490007:6: f42579b7: Session variable 'session.assigned.webtop' set to '/Common/mobilevpn_webtop'
>> 2016-06-08T15:54:44+00:00 xxxxxxxxxxxxxx.com apd[7764] 01490007:6: f42579b7: Session variable 'session.assigned.uuid' set to 'tmm.uuid./Common/mobilevpn_access-policy.C069850'
>> 2016-06-08T15:54:44+00:00 xxxxxxxxxxxxxx.com apd[7764] 01490007:6: f42579b7: Session variable 'session.assigned.resources.na' set to '/Common/mobilevpn_network-acl'
>> 2016-06-08T15:54:44+00:00 xxxxxxxxxxxxxx.com apd[7764] 01490007:6: f42579b7: Session variable 'session.assigned.acls' set to '/Common/mobilevpn_acl'
>>
>> AFTER
>> 2016-06-08T15:54:44+00:00 xxxxxxxxxxxxxx.com apd[7764] 01490007:6: f42579b7: Session variables 'session.policy.result' set to 'allow' 'session.logon.page.errorcode' set to '0' 'session.assigned.webtop' set to '/Common/mobilevpn_webtop' 'session.assigned.uuid' set to 'tmm.uuid./Common/mobilevpn_access-policy.C069850' 'session.assigned.resources.na' set to '/Common/mobilevpn_network-acl' 'session.assigned.acls' set to '/Common/mobilevpn_acl'
>>
>> ----------------
>>
>> Thanks.
>> --
>> Shaun McAdams
>>
>> From: Aldrin Piri
>> Reply-To: "[email protected]"
>> Date: Tuesday, June 21, 2016 at 10:04 AM
>> To: "[email protected]"
>> Subject: Re: Nifi Combiner Processor?
>>
>> Hi Shaun,
>>
>> While there is no explicit processor that will carry this out in one action,
>> I believe we have the tools in place for you to accomplish the functionality
>> with our standard processors.
>>
>> Not sure I have your exact case, but the way I approached this is through
>> the following sample data:
>>
>> 1,Session1,<other data>
>> 2,Session1,<other data>
>> 3,Session2,<other data>
>> 4,Session4,<other data>
>>
>> transforming to 3 resultant groupings:
>>
>> 1,Session1,<other data>
>> 2,Session1,<other data>
>>
>> 3,Session2,<other data>
>>
>> 4,Session4,<other data>
>>
>> I think SplitText[1] and ExtractText[2] with MergeContent[3] (optionally)
>> may be able to help you with your case. SplitText would break incoming data
>> into a single event line. ExtractText would be able to find your session
>> variable from the line and promote it to an attribute. This attribute could
>> then be used for the 'Correlation Attribute Name' to group each of the
>> separate lines together. I am a little unclear on the "want multiple lines
>> carrying a session variable to be group one session variables line," but
>> this probably gets us close if the interpretation was incorrect.
>>
>> Feel free to provide some sample data (I'm not familiar with the F5 log
>> format) or some additional details if this comes up a bit short.
>>
>> [1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.SplitText/index.html
>> [2] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ExtractText/index.html
>> [3] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.MergeContent/index.html
>>
>> On Tue, Jun 21, 2016 at 9:48 AM, Shaun McAdams <[email protected]> wrote:
>> Hey users,
>>
>> I was sent a request for a Splunk use case to lower some of the volume going
>> to enterprise Splunk. Data is from an F5 (log). Easily enough they want some
>> data dropped, however they also want multiple lines carrying a session
>> variable to be group one session variables line. I don’t see an
>> implementation of such a combiner in Nifi itself and want to make sure I’m
>> not overlooking something. It appears I need to site-to-site this to a
>> spark instance running the combiner. (as one possible solution for them).
>> Wondered if anyone else had implemented such a use case.
>>
>> Thanks.
>> --
>> Shaun McAdams
