Another crazy idea — would it be more computationally efficient to use NiFi’s 
REST API to add a new instance of the GetTwitter processor if a new endpoint 
was needed? Basically track using the state manager which terms are currently 
registered (a map of terms to processor IDs) and if a new term needs to be 
searched, duplicate an existing processor and replace the search term? They 
could all be located in a specific PG to allow for isolation from the 
“meta-flow” that is operating on NiFi itself.
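
A rough sketch of the bookkeeping, to make the idea concrete. Everything here is an assumption rather than tested code: the function names are made up, the REST path and request body are modeled on the NiFi 1.x REST API and should be checked against the version in use, and "Terms to Filter On" is my recollection of the GetTwitter property name. It also creates a fresh processor by type instead of duplicating an existing one, so the Twitter credentials would still have to be set separately. No network calls are made; it only plans the changes and builds the request.

```python
from typing import Dict, List, Set, Tuple

# Processor class as it appears in the NiFi component docs URL.
GET_TWITTER_TYPE = "org.apache.nifi.processors.twitter.GetTwitter"
# Assumed display name of the GetTwitter search-term property.
TERMS_PROPERTY = "Terms to Filter On"


def plan_changes(registered: Dict[str, str],
                 desired: Set[str]) -> Tuple[List[str], List[str]]:
    """Compare the term -> processor ID map against the desired terms.

    Returns (terms that need a new processor, processor IDs to remove).
    """
    to_add = sorted(desired - set(registered))
    to_remove = sorted(pid for term, pid in registered.items()
                       if term not in desired)
    return to_add, to_remove


def build_create_request(process_group_id: str, term: str) -> Tuple[str, dict]:
    """Build the (path, JSON body) for creating one GetTwitter processor.

    Hypothetical helper: the path and body shape are assumptions based on
    the NiFi 1.x REST API, not something verified in this thread.
    """
    path = f"/nifi-api/process-groups/{process_group_id}/processors"
    body = {
        "revision": {"version": 0},
        "component": {
            "type": GET_TWITTER_TYPE,
            "name": f"GetTwitter [{term}]",
            "config": {"properties": {TERMS_PROPERTY: term}},
        },
    }
    return path, body


if __name__ == "__main__":
    # Placeholder IDs; real ones would come from the state manager.
    registered = {"#usa": "proc-1", "#mars": "proc-2"}
    to_add, to_remove = plan_changes(registered, {"#usa", "#earth"})
    for term in to_add:
        print(build_create_request("twitter-pg", term))
    print("remove:", to_remove)
```

The map would live in the state manager, and the meta-flow would POST each request and store the returned processor ID under its term.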

Andy LoPresto
[email protected]
[email protected]
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Aug 25, 2016, at 11:45 AM, Andy LoPresto <[email protected]> wrote:
> 
> Yeah I had a feeling there was a reason it didn’t support EL in the first 
> place but didn’t know enough of the context. Thanks Aldrin.
> 
> @Sven,
> 
> Writing a custom processor is always a good exercise. If you are familiar 
> with Python/Groovy/Ruby I would suggest prototyping with ExecuteScript to get 
> a feel for the processor lifecycle and very rapid development feedback loop, 
> and then transition to full-scale NAR development.
> 
> If you run into any roadblocks or have more in-depth questions, I would 
> recommend asking on the developer list as it is a bit more technical and some 
> of the experienced NiFi users (even those not on the core development team) 
> respond quickly to questions on that list.
> 
> Matt Burgess has written a number of articles about this that are very 
> helpful [1][2].
> 
> [1] https://funnifi.blogspot.com/2016/02/executescript-processor-hello-world.html
> [2] https://funnifi.blogspot.com/2016/02/writing-reusable-scripted-processors-in.html
> 
> Andy LoPresto
> [email protected] <mailto:[email protected]>
> [email protected] <mailto:[email protected]>
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
> 
>> On Aug 25, 2016, at 10:34 AM, Sven Davison <[email protected] 
>> <mailto:[email protected]>> wrote:
>> 
>> Good to know. The more I think about this, the more it seems like a tech/dev 
>> version of the movie "The Pentagon Wars". Maybe a custom processor would 
>> serve a dual purpose: getting it done and building my first custom 
>> processor.
>> 
>> On Thu, Aug 25, 2016 at 1:24 PM, Aldrin Piri <[email protected] 
>> <mailto:[email protected]>> wrote:
>> One reason it does not support EL is the client that the processor wraps, 
>> which registers with a given endpoint. Supporting EL would potentially 
>> require a disconnect/reconnect cycle for every FlowFile presented to the 
>> processor (some smart caching could certainly lessen the effect). Currently, 
>> filtering is tightly integrated with the lifecycle of the processor. A more 
>> dynamic processor could be achieved, but it would come with a few caveats.
>> 
>> On Thu, Aug 25, 2016 at 1:03 PM, Sven Davison <[email protected] 
>> <mailto:[email protected]>> wrote:
>> That's close to the same flow I was looking at, really, but it was chucked 
>> out for lack of EL support within GetTwitter. The good news is... we're 
>> learning!
>> 
>> On Thu, Aug 25, 2016 at 12:52 PM, Andy LoPresto <[email protected] 
>> <mailto:[email protected]>> wrote:
>> Hi Sven,
>> 
>> Someone may have a more streamlined solution, but I’d suggest taking a look 
>> at ExecuteSQL [1] to read from the database, ConvertAvroToJSON [2] to 
>> convert the output of the SQL query to JSON, and EvaluateJsonPath [3] to 
>> extract the specific values you are interested in. Then use UpdateAttribute 
>> [4] to populate those values from the flowfile content to an attribute, and 
>> finally use GetTwitter [5] to filter on those values.
>> 
>> However, at this time the query fields in GetTwitter do not support 
>> Expression Language, so you will have to do one of the following:
>> 
>> * Modify the source of GetTwitter to support EL
>> * Raise a Jira requesting this feature
>> * Write a small script wrapping GetTwitter using ExecuteScript [6] to 
>> populate those values
>> 
>> Sorry it’s not a cleaner solution. I would encourage you to raise the Jira 
>> [7] to have GetTwitter support EL in the query properties. It’s likely I am 
>> overlooking a potentially simpler flow, but without EL support in 
>> GetTwitter, I don’t see an easy way forward.
>> 
>> [1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ExecuteSQL/index.html
>> [2] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.avro.ConvertAvroToJSON/index.html
>> [3] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.EvaluateJsonPath/index.html
>> [4] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.attributes.UpdateAttribute/index.html
>> [5] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.twitter.GetTwitter/index.html
>> [6] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.script.ExecuteScript/index.html
>> [7] https://issues.apache.org/jira/secure/CreateIssue!default.jspa
>> 
>> Andy LoPresto
>> [email protected] <mailto:[email protected]>
>> [email protected] <mailto:[email protected]>
>> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>> 
>>> On Aug 25, 2016, at 9:10 AM, Sven Davison <[email protected] 
>>> <mailto:[email protected]>> wrote:
>>> 
>>> I have a GetTwitter processor which works wonders. I'm tracking a few 
>>> people and a couple of hashtags, but I'm also pulling all hashtags out of 
>>> the posts and tracking how many times I saw each one and when I last saw 
>>> it.
>>> 
>>> example tweet: "hello world #earth #usa"
>>> 
>>> If I'm watching #usa, I'll still get both tags and put them into my 
>>> database, using the tag as the id, a count for how many times it's been 
>>> seen, and a lastSeen field for when it was last seen.
>>> 
>>> What I would like to do is dynamically follow new tags upon condition X: 
>>> say, once #earth gets more than 500 posts, and only if the tag was seen in 
>>> the last 7 days. I can make a view in MySQL to build the result set, but 
>>> how do I get that result set into NiFi to follow those tags, which will 
>>> change over time?
>>> 
>>> 
>> 
>> 
>> 
>> 
> 

