Another crazy idea: would it be more computationally efficient to use NiFi’s REST API to add a new instance of the GetTwitter processor whenever a new endpoint is needed? Basically, use the state manager to track which terms are currently registered (a map of terms to processor IDs), and if a new term needs to be searched, duplicate an existing processor and replace its search term. The processors could all live in a dedicated process group (PG), isolating them from the “meta-flow” that is operating on NiFi itself.
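
A rough sketch of that bookkeeping in Python, in case it helps make the idea concrete. This is untested against a live instance and rests on several assumptions: the base URL is a placeholder, the `POST /process-groups/{id}/processors` endpoint and request body follow the 1.x REST API layout, "Terms to Filter On" is assumed to be the name of the GetTwitter property, and a plain dict stands in for the state manager's term-to-processor-ID map. The helper names are made up for illustration.

```python
import json
import urllib.request

# Assumed values, not taken from this thread: adjust for your instance.
NIFI_API = "http://localhost:8080/nifi-api"
GET_TWITTER_TYPE = "org.apache.nifi.processors.twitter.GetTwitter"
TERMS_PROPERTY = "Terms to Filter On"  # assumed name of the GetTwitter property


def build_create_request(term, base_properties, index=0):
    """Build the JSON body for one new GetTwitter processor tracking `term`."""
    # Copy the properties of the processor being "duplicated" (keys, tokens, ...)
    properties = dict(base_properties)
    properties[TERMS_PROPERTY] = term
    return {
        "revision": {"version": 0},  # assumed: new components start at version 0
        "component": {
            "type": GET_TWITTER_TYPE,
            "name": "GetTwitter [%s]" % term,
            "position": {"x": 0.0, "y": 150.0 * index},  # stack them vertically
            "config": {"properties": properties},
        },
    }


def create_processor(group_id, body):
    """POST the body to NiFi and return the new processor's ID (needs a live NiFi)."""
    request = urllib.request.Request(
        "%s/process-groups/%s/processors" % (NIFI_API, group_id),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["id"]


def register_terms(wanted_terms, registry, group_id, base_properties,
                   create=create_processor):
    """Create a processor for every term missing from `registry` (term -> ID)."""
    for term in wanted_terms:
        if term in registry:
            continue  # already tracked, nothing to do
        body = build_create_request(term, base_properties, index=len(registry))
        registry[term] = create(group_id, body)
    return registry
```

The new processors would still need to be connected to the rest of the flow and started, which this sketch leaves out. Passing `create` as a parameter is only there so the bookkeeping can be exercised without a running NiFi.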

Andy LoPresto
[email protected]
[email protected]
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69

> On Aug 25, 2016, at 11:45 AM, Andy LoPresto <[email protected]> wrote:
>
> Yeah I had a feeling there was a reason it didn’t support EL in the first
> place but didn’t know enough of the context. Thanks Aldrin.
>
> @Sven,
>
> Writing a custom processor is always a good exercise. If you are familiar
> with Python/Groovy/Ruby I would suggest prototyping with ExecuteScript to get
> a feel for the processor lifecycle and very rapid development feedback loop,
> and then transition to full-scale NAR development.
>
> If you run into any roadblocks or have more in-depth questions, I would
> recommend asking on the developer list as it is a bit more technical and some
> of the experienced NiFi users (even those not on the core development team)
> respond quickly to questions on that list.
>
> Matt Burgess has written a number of articles about this that are very
> helpful [1][2].
>
> [1] https://funnifi.blogspot.com/2016/02/executescript-processor-hello-world.html
> [2] https://funnifi.blogspot.com/2016/02/writing-reusable-scripted-processors-in.html
>
> Andy LoPresto
> [email protected]
> [email protected]
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
>
>> On Aug 25, 2016, at 10:34 AM, Sven Davison <[email protected]> wrote:
>>
>> Good to know. The more i think about this, the more it seems like a tech/dev
>> version of the movie called 'pentagon wars'. Maybe a custom processor would
>> serve a dual purpose.. getting it done.. and building my first custom
>> processor.
>>
>> On Thu, Aug 25, 2016 at 1:24 PM, Aldrin Piri <[email protected]> wrote:
>> One consideration for why it does not support EL is due to the client the
>> processor is wrapping that registers with a given endpoint. EL would
>> require this disconnect/reconnection process to potentially happen on every
>> FlowFile presented to the processor (some smart caching could certainly
>> lessen the effect). Currently, filtering and such is very much integrated
>> with the lifecycle of the processor. A more dynamic processor could be
>> achieved, but will come with a few caveats.
>>
>> On Thu, Aug 25, 2016 at 1:03 PM, Sven Davison <[email protected]> wrote:
>> that's close to the same flow i was looking at really. but was chucked out
>> for lack of EL support w/in GetTwitter. The good news is... we're learning!
>>
>> On Thu, Aug 25, 2016 at 12:52 PM, Andy LoPresto <[email protected]> wrote:
>> Hi Sven,
>>
>> Someone may have a more streamlined solution, but I’d suggest taking a look
>> at ExecuteSQL [1] to read from the database, ConvertAvroToJSON [2] to
>> convert the output of the SQL query to JSON, and EvaluateJsonPath [3] to
>> extract the specific values you are interested in. Then use UpdateAttribute
>> [4] to populate those values from the flowfile content to an attribute, and
>> finally use GetTwitter [5] to filter on those values.
>>
>> However, at this time the query fields in GetTwitter do not support
>> Expression Language, so you will have to:
>>
>> * Modify the source of GetTwitter to support EL
>> * Raise a Jira requesting this feature
>> * Write a small script wrapping GetTwitter using ExecuteScript [6] to
>>   populate those values
>>
>> Sorry it’s not a cleaner solution. I would encourage you to raise the Jira
>> [7] to have GetTwitter support EL in the query properties. It’s likely I am
>> overlooking a potential simpler flow, but without EL support in GetTwitter,
>> I don’t see an easy way forward.
>>
>> [1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ExecuteSQL/index.html
>> [2] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.avro.ConvertAvroToJSON/index.html
>> [3] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.EvaluateJsonPath/index.html
>> [4] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.attributes.UpdateAttribute/index.html
>> [5] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.twitter.GetTwitter/index.html
>> [6] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.script.ExecuteScript/index.html
>> [7] https://issues.apache.org/jira/secure/CreateIssue!default.jspa
>>
>> Andy LoPresto
>> [email protected]
>> [email protected]
>> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
>>
>>> On Aug 25, 2016, at 9:10 AM, Sven Davison <[email protected]> wrote:
>>>
>>> i have a GetTwitter processor which works wonders. I'm tracking a few
>>> people and a couple hash tags but i'm also pulling all hashtags out of the
>>> posts and tracking how many times i saw it and when the last time was that
>>> i saw it.
>>>
>>> example tweet: "hello world #earth #usa"
>>>
>>> if i'm watching #usa, i'll still get both tags and put them into my
>>> database. using the tag as the id, a count for how many times it's been
>>> seen and a lastSeen field for when it was last seen.
>>>
>>> what i would like to do, is dynamically follow new tags upon condition X.
>>> Say... once #earth gets more than 500 posts and only if the tag was seen in
>>> the last 7 days. I can make a view in MySQL to build the result set, but
>>> how do i get that result set into nifi, to follow those tags that will
>>> change.
