[ 
https://issues.apache.org/jira/browse/NIFI-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15224415#comment-15224415
 ] 

Joseph Percivall commented on NIFI-1582:
----------------------------------------

[~jskora], the state changes include a "stateless" option that keeps exactly
the same logic as before, so the default performance shouldn't be affected at
all by the introduction of the state option.

The batching change actually brings it in line with the updates to batching as
a whole: instead of forcing a batch size of 100 at all times, it lets the user
apply the batching configuration in the scheduling tab of the processor
config. This could potentially lead to an increase in performance.

As for the performance of state, it could vary dramatically between local and
clustered state. Local state just works with an in-memory map, so performance
shouldn't be significantly affected. Clustered state gets and pushes the
entire map to a ZooKeeper instance, which could hurt performance.
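
To make the cost difference concrete, here is a minimal sketch that does not
use the NiFi API at all. LocalStore, RemoteStore and serialize are hypothetical
names invented for illustration, and the "remote" store only counts the bytes
it would have to push if the whole map were sent on every update, as described
above for clustered state.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; none of these types exist in NiFi. */
final class StateCostSketch {

    /** Local-style store: an update touches a single entry of an in-memory map. */
    static final class LocalStore {
        private final Map<String, String> map = new HashMap<>();

        void put(String key, String value) {
            map.put(key, value);
        }
    }

    /** Clustered-style store: every update re-serializes and "pushes" the whole map. */
    static final class RemoteStore {
        private final Map<String, String> map = new HashMap<>();
        long bytesSent = 0;

        void put(String key, String value) {
            map.put(key, value);
            // Assumption: the full map is sent on each update, so cost grows with map size.
            bytesSent += serialize(map).length;
        }
    }

    /** Naive key=value serialization, one entry per line. */
    static byte[] serialize(Map<String, String> map) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : map.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        LocalStore local = new LocalStore();
        RemoteStore remote = new RemoteStore();
        for (int i = 0; i < 100; i++) {
            local.put("key" + i, "value" + i);
            remote.put("key" + i, "value" + i);
        }
        System.out.println("bytes pushed by remote-style store: " + remote.bytesSent);
    }
}
```

In this toy model the bytes pushed per update grow with the number of keys
held in state, which is why a small state map matters more in the clustered
case than in the local one.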

That being said, I've run it on my MacBook with local and clustered (ZK
instance on same box) state and seen very good performance.

> New processor to update attributes with state
> ---------------------------------------------
>
>                 Key: NIFI-1582
>                 URL: https://issues.apache.org/jira/browse/NIFI-1582
>             Project: Apache NiFi
>          Issue Type: New Feature
>            Reporter: Joseph Percivall
>            Assignee: Joseph Percivall
>
> This idea was sparked by a thread on the user list and should allow basic 
> data science:
> I expect that in the future I’ll need something a little more sophisticated 
> but for now my problem is very simple:
> I want to be able to trigger an alert (only once) when an attribute in an 
> incoming stream, for instance, goes over a predefined threshold. The 
> Processor should then trigger (only once again) another trigger when the 
> signal goes back to normal (below threshold). Basically a RouteByAttribute 
> but with memory.
> Thanks 
> Claudio
> ------------------------------------------------
> Hello Claudio,
> Your use-case actually could leverage a couple of recently added features to 
> create a really cool open-source processor. The two key features that were 
> added are State Management and the ability to reference processor specific 
> variables in expression language. You can take a look at RouteText to see 
> both in action. 
> By utilizing both you can create a processor that is configured with multiple
> Expression Language expressions. There would be dynamic properties which
> would accept Expression Language and then store the evaluated value via state
> management. Then there would be a routing property (that supports Expression
> Language) that could simply add an attribute to the flowfile with the
> evaluated value, which would allow it to be used by following processors for
> routing.
> This would allow you to do your use-case where you store the value for the 
> incoming stream and route differently once you go over a threshold. It could 
> even allow more complex use-cases. One instance that I believe would be
> possible is to keep a running average and standard deviation and route data
> to different locations based on its standard deviation.
> You can think of this like an UpdateAttribute with the ability to store and 
> calculate variables using expression language.
> Joe
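
The "RouteByAttribute but with memory" behaviour requested above can be
sketched in plain Java, without any NiFi classes. ThresholdAlert, its "above"
state key and the ALERT/RECOVERED labels are hypothetical names chosen for
illustration; the HashMap merely stands in for whatever state storage the
processor ends up using.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: signals only when a value crosses the threshold. */
final class ThresholdAlert {
    private static final String KEY = "above"; // hypothetical state key

    private final double threshold;
    // Stand-in for processor state: a simple string key/value map.
    private final Map<String, String> state = new HashMap<>();

    ThresholdAlert(double threshold) {
        this.threshold = threshold;
    }

    /** Returns "ALERT" or "RECOVERED" on a transition, and empty otherwise. */
    Optional<String> update(double value) {
        boolean wasAbove = Boolean.parseBoolean(state.getOrDefault(KEY, "false"));
        boolean isAbove = value > threshold;
        state.put(KEY, Boolean.toString(isAbove)); // remember for the next value
        if (isAbove == wasAbove) {
            return Optional.empty(); // no crossing, so nothing to signal
        }
        return Optional.of(isAbove ? "ALERT" : "RECOVERED");
    }

    public static void main(String[] args) {
        ThresholdAlert alert = new ThresholdAlert(10.0);
        for (double v : new double[] {5, 12, 15, 8, 7, 11}) {
            System.out.println(v + " -> " + alert.update(v).orElse("-"));
        }
    }
}
```

With a threshold of 10, the sample values 5, 12, 15, 8, 7, 11 signal ALERT at
12, RECOVERED at 8 and ALERT again at 11; the other values produce no signal
because the remembered state has not changed.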



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
