And of course the Developer Guide [1] and Contributor Guide [2] on the NiFi 
site.

[1] https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html
[2] https://cwiki.apache.org/confluence/display/NIFI/Contributor+Guide 


Andy LoPresto
[email protected]
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Aug 25, 2016, at 11:48 AM, Andy LoPresto <[email protected]> wrote:
> 
> Another crazy idea — would it be more computationally efficient to use NiFi’s 
> REST API to add a new instance of the GetTwitter processor if a new endpoint 
> was needed? Basically track using the state manager which terms are currently 
> registered (a map of terms to processor IDs) and if a new term needs to be 
> searched, duplicate an existing processor and replace the search term? They 
> could all be located in a specific PG to allow for isolation from the 
> “meta-flow” that is operating on NiFi itself.
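A rough sketch of the bookkeeping side of this idea, in plain Python. It only compares the stored map of terms to processor IDs against the terms currently wanted; the REST calls that would actually create or delete processors are left out. The function name and the processor IDs are hypothetical, and the map is assumed to be stored with lowercased terms.

```python
def plan_processor_changes(registered, wanted_terms):
    """Compare the stored term -> processor ID map with the wanted terms.

    Returns (to_add, to_remove): the terms that still need a GetTwitter
    instance, and a dict of term -> processor ID for instances that are
    no longer needed. Keys of `registered` are assumed to be lowercased.
    """
    wanted = {t.strip().lower() for t in wanted_terms if t.strip()}
    to_add = sorted(wanted - set(registered))
    to_remove = {t: pid for t, pid in registered.items() if t not in wanted}
    return to_add, to_remove


# Hypothetical state: two terms already have a processor each.
registered = {"#usa": "proc-1", "#earth": "proc-2"}
to_add, to_remove = plan_processor_changes(registered, ["#usa", "#mars"])
print(to_add)      # ['#mars']
print(to_remove)   # {'#earth': 'proc-2'}
```

Each term in `to_add` would then map to one "duplicate a processor and swap the search term" request, and each entry in `to_remove` to one stop-and-delete request.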
> 
>> On Aug 25, 2016, at 11:45 AM, Andy LoPresto <[email protected]> wrote:
>> 
>> Yeah I had a feeling there was a reason it didn’t support EL in the first 
>> place but didn’t know enough of the context. Thanks Aldrin.
>> 
>> @Sven,
>> 
>> Writing a custom processor is always a good exercise. If you are familiar 
>> with Python/Groovy/Ruby, I would suggest prototyping with ExecuteScript to 
>> get a feel for the processor lifecycle with a very rapid development 
>> feedback loop, and then transitioning to full-scale NAR development.
>> 
>> If you run into any roadblocks or have more in-depth questions, I would 
>> recommend asking on the developer list as it is a bit more technical and 
>> some of the experienced NiFi users (even those not on the core development 
>> team) respond quickly to questions on that list.
>> 
>> Matt Burgess has written a number of articles about this that are very 
>> helpful [1][2].
>> 
>> [1] https://funnifi.blogspot.com/2016/02/executescript-processor-hello-world.html
>> [2] https://funnifi.blogspot.com/2016/02/writing-reusable-scripted-processors-in.html
>> 
>>> On Aug 25, 2016, at 10:34 AM, Sven Davison <[email protected]> wrote:
>>> 
>>> Good to know. The more I think about this, the more it seems like a 
>>> tech/dev version of the movie "The Pentagon Wars". Maybe a custom 
>>> processor would serve a dual purpose: getting it done, and building my 
>>> first custom processor.
>>> 
>>> On Thu, Aug 25, 2016 at 1:24 PM, Aldrin Piri <[email protected]> wrote:
>>> One reason it does not support EL is that the client the processor wraps 
>>> registers with a given endpoint. Supporting EL would potentially require a 
>>> disconnect/reconnect cycle on every FlowFile presented to the processor 
>>> (some smart caching could certainly lessen the effect). Currently, 
>>> filtering and such is very much integrated with the lifecycle of the 
>>> processor. A more dynamic processor could be achieved, but it would come 
>>> with a few caveats.
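To illustrate the caching point, here is a minimal sketch in plain Python, not tied to the real GetTwitter client. It reconnects only when the normalized set of filter terms changes, so repeated FlowFiles with the same terms reuse the existing connection. The class name and the `connect` callable are hypothetical stand-ins for whatever the wrapped client provides.

```python
class CachedStreamClient:
    """Reconnects only when the requested filter terms actually change."""

    def __init__(self, connect):
        # `connect` is a hypothetical callable: frozenset of terms -> connection.
        self._connect = connect
        self._terms = None
        self._connection = None
        self.reconnects = 0

    def connection_for(self, terms):
        # Normalize so that order, case and stray whitespace do not
        # trigger a needless reconnect.
        key = frozenset(t.strip().lower() for t in terms if t.strip())
        if key != self._terms:
            # A real client would also close the old connection here.
            self._connection = self._connect(key)
            self._terms = key
            self.reconnects += 1
        return self._connection


client = CachedStreamClient(connect=lambda terms: sorted(terms))
for flowfile_terms in (["#usa"], ["#usa"], ["#USA "], ["#usa", "#earth"]):
    client.connection_for(flowfile_terms)
print(client.reconnects)   # 2
```

This only lessens the cost; a flow that alternates between different term sets would still reconnect on every change.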
>>> 
>>> On Thu, Aug 25, 2016 at 1:03 PM, Sven Davison <[email protected]> wrote:
>>> That's close to the same flow I was looking at, really, but it was chucked 
>>> out for lack of EL support within GetTwitter. The good news is... we're 
>>> learning!
>>> 
>>> On Thu, Aug 25, 2016 at 12:52 PM, Andy LoPresto <[email protected]> wrote:
>>> Hi Sven,
>>> 
>>> Someone may have a more streamlined solution, but I’d suggest taking a look 
>>> at ExecuteSQL [1] to read from the database, ConvertAvroToJSON [2] to 
>>> convert the output of the SQL query to JSON, and EvaluateJsonPath [3] to 
>>> extract the specific values you are interested in. Then use UpdateAttribute 
>>> [4] to populate those values from the flowfile content to an attribute, and 
>>> finally use GetTwitter [5] to filter on those values.
>>> 
>>> However, at this time the query fields in GetTwitter do not support 
>>> Expression Language, so you will have to do one of the following:
>>> 
>>> * Modify the source of GetTwitter to support EL
>>> * Raise a Jira requesting this feature
>>> * Write a small script wrapping GetTwitter using ExecuteScript [6] to 
>>> populate those values
>>> 
>>> Sorry it’s not a cleaner solution. I would encourage you to raise the Jira 
>>> [7] to have GetTwitter support EL in the query properties. It’s likely I am 
>>> overlooking a potentially simpler flow, but without EL support in 
>>> GetTwitter, I don’t see an easy way forward.
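For the script-wrapper option, the data-handling part might look roughly like the plain-Python sketch below: take the JSON rows produced by the SQL-to-JSON steps and collapse one column into a single terms string. The function name, the `id` column name, and the assumption that the filter property accepts a comma-separated list are all illustrative, and the NiFi session handling is omitted.

```python
import json


def terms_from_rows(rows_json, field="id"):
    """Collect one column from JSON query rows into a comma-separated string."""
    rows = json.loads(rows_json)
    if isinstance(rows, dict):
        # A single row may arrive as one object rather than an array.
        rows = [rows]
    seen = []
    for row in rows:
        value = row.get(field)
        if value is None:
            continue
        value = str(value).strip()
        if value and value not in seen:   # keep first-seen order, drop repeats
            seen.append(value)
    return ",".join(seen)


# Hypothetical query output with a repeated row.
sample = (
    '[{"id": "#earth", "count": 612},'
    ' {"id": "#usa", "count": 950},'
    ' {"id": "#earth", "count": 612}]'
)
print(terms_from_rows(sample))   # #earth,#usa
```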
>>> 
>>> [1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ExecuteSQL/index.html
>>> [2] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.avro.ConvertAvroToJSON/index.html
>>> [3] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.EvaluateJsonPath/index.html
>>> [4] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.attributes.UpdateAttribute/index.html
>>> [5] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.twitter.GetTwitter/index.html
>>> [6] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.script.ExecuteScript/index.html
>>> [7] https://issues.apache.org/jira/secure/CreateIssue!default.jspa
>>> 
>>>> On Aug 25, 2016, at 9:10 AM, Sven Davison <[email protected]> wrote:
>>>> 
>>>> I have a GetTwitter processor which works wonders. I'm tracking a few 
>>>> people and a couple of hashtags, but I'm also pulling all hashtags out of 
>>>> the posts and tracking how many times I saw each one and when I last saw 
>>>> it.
>>>> 
>>>> Example tweet: "hello world #earth #usa"
>>>> 
>>>> If I'm watching #usa, I'll still get both tags and put them into my 
>>>> database, using the tag as the id, a count for how many times it's been 
>>>> seen, and a lastSeen field for when it was last seen.
>>>> 
>>>> What I would like to do is dynamically follow new tags upon condition X: 
>>>> say, once #earth gets more than 500 posts, and only if the tag was seen 
>>>> in the last 7 days. I can make a view in MySQL to build the result set, 
>>>> but how do I get that result set into NiFi so that it follows those 
>>>> tags, which will change over time?
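The selection rule described here (more than 500 posts, seen within the last 7 days) can be written down as a small Python sketch. The row layout with `id`, `count` and `lastSeen` follows the description above; the function name, the default thresholds as parameters, and the sample rows are made up for illustration. In practice the same rule would live in the MySQL view.

```python
from datetime import datetime, timedelta


def tags_to_follow(rows, now, min_count=500, max_age_days=7):
    """Return the ids of tags seen more than min_count times and recently enough."""
    cutoff = now - timedelta(days=max_age_days)
    return [
        r["id"]
        for r in rows
        if r["count"] > min_count and r["lastSeen"] >= cutoff
    ]


now = datetime(2016, 8, 25, 12, 0)
rows = [
    {"id": "#earth", "count": 501, "lastSeen": datetime(2016, 8, 24)},
    {"id": "#usa", "count": 500, "lastSeen": datetime(2016, 8, 25)},   # not above 500
    {"id": "#moon", "count": 900, "lastSeen": datetime(2016, 8, 1)},   # too old
]
print(tags_to_follow(rows, now))   # ['#earth']
```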
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> 
>> 
> 
