Thanks Bryan - Should I be assuming that my service’s local map needs to be thread-safe, or would all service calls likely be executed from within a single thread? I assume the former but want to be sure.

Assuming that thread-safety is needed, it seems like I should be using something like ConcurrentHashMap for my cache, correct?

-Tim

> On May 1, 2018, at 8:07 AM, Bryan Bende <[email protected]> wrote:
>
> Tim,
>
> The reason the DMC works the way it does is because the cached data
> needs to be shared across a cluster. For example, a processor like
> DetectDuplicate needs to detect duplicates across all NiFi nodes and
> not just the local node, or the same thing with Wait/Notify.
>
> In your case I don't think you have the need to share data across
> nodes, so each NiFi node can have an instance of your controller
> service which could have a HashMap as you described.
>
> You could definitely clear the map on enabled/disabled, and you could
> also implement strategies based on time like if a cached value is
> older than a certain threshold then remove it and re-fetch. It is
> really up to how you use the services.
>
> I don't see any issues with memory as long as your cache doesn't grow
> indefinitely.
>
> -Bryan
>
>
> On Tue, May 1, 2018 at 6:47 AM, Otto Fowler <[email protected]> wrote:
>> https://hc.apache.org/httpcomponents-client-ga/tutorial/html/caching.html ?
>>
>>
>> On May 1, 2018 at 00:01:58, Tim Dean ([email protected]) wrote:
>>
>> Hello,
>>
>> I have a custom NiFi controller service that retrieves data from an external
>> web service via HTTP requests. The results from these HTTP requests will be
>> needed at various points throughout my process flow. In some situations, I
>> could end up needing to access the HTTP response dozens or even hundreds of
>> times.
>>
>> Given that the results of the HTTP request rarely change, I’d like them to
>> be cached by my service and returned to my processors when needed. I’d need
>> some way to explicitly clear the cache for those occasions when the data in
>> the service does change.
>>
>> I’ve looked at using the DistributedMapCacheClientService implementation to
>> cache my web service’s results, but it seems like that connects to a server
>> via a socket connection and that doesn’t seem like it would be all that much
>> more efficient than calling the web service directly. I’ve also looked at
>> using the service’s state manager to store the results as state, but my data
>> is a little more complex than what the documentation for state suggests is
>> optimal: I don’t think my total map size will get to 1MB in size but it
>> could be possible.
>>
>> Am I overthinking this? Would a simpler solution like creating a simple Java
>> HashMap inside my controller service be adequate? I could empty the contents
>> of the hash map whenever the controller services is enabled/disabled. Would
>> the memory used by this kind of simplified local caching cause problems
>> somewhere down the line?
>>
>> Are there other caching strategies I should be considering?
>>
>> Thanks
>>
>> -Tim
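
For reference, here is a minimal sketch of the local cache discussed in this thread, assuming the service's methods may be called from several threads at once. The class name LocalCache, its get/clear/size methods and the max-age parameter are hypothetical names chosen for illustration; none of them are part of the NiFi API. The sketch relies on ConcurrentHashMap.compute so that the "check age, then re-fetch" step for a given key is atomic, and it treats entries older than a threshold as stale, along the lines Bryan suggested:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Hypothetical thread-safe local cache with a maximum entry age. */
final class LocalCache<K, V> {

    /** A cached value together with the time it was fetched. */
    private static final class Entry<V> {
        final V value;
        final long fetchedAtNanos;

        Entry(V value, long fetchedAtNanos) {
            this.value = value;
            this.fetchedAtNanos = fetchedAtNanos;
        }
    }

    private final ConcurrentMap<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long maxAgeNanos;

    LocalCache(long maxAgeMillis) {
        this.maxAgeNanos = TimeUnit.MILLISECONDS.toNanos(maxAgeMillis);
    }

    /**
     * Returns the cached value for the key, calling the loader (for example
     * the HTTP request) only when no entry exists or the entry is too old.
     */
    V get(K key, Function<K, V> loader) {
        long now = System.nanoTime();
        Entry<V> entry = entries.compute(key, (k, existing) -> {
            if (existing != null && now - existing.fetchedAtNanos < maxAgeNanos) {
                return existing; // still fresh, keep it
            }
            return new Entry<>(loader.apply(k), now); // missing or stale, re-fetch
        });
        return entry.value;
    }

    /** Explicitly empties the cache, e.g. when the remote data has changed. */
    void clear() {
        entries.clear();
    }

    int size() {
        return entries.size();
    }
}
```

In a controller service, clear() could be called from the methods annotated with @OnEnabled and @OnDisabled. One trade-off to be aware of: the loader runs inside compute, so other threads updating nearby keys may be blocked while a slow HTTP call is in progress, and the loader must not modify the same map. This sketch also does not bound the number of entries, so it only fits the case where the set of keys is known to stay small.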
