Hi Chris,

thanks a lot for your information.

I was not aware that "DistributedMapCacheServer" should not be used in production. Maybe a short hint in the Controller Service documentation would also be helpful for other users.

Your pointer to RedisDistributedMapCacheClientService led us to the decision to use Redis for distributed caching in the future (we are using it for the first time now). Which Redis persistence type to use (RDB and/or AOF) will be important for balancing data loss against performance.
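
As a rough sketch of that trade-off (not a tested production setup): RDB writes periodic snapshots, so writes since the last snapshot can be lost, while AOF with "appendfsync everysec" logs every write and limits the loss to roughly one second. The helper name and the snapshot thresholds below are only assumptions for illustration; the directives themselves (save, appendonly, appendfsync) are standard redis.conf settings.

```python
# Hypothetical helper: maps a persistence choice to redis.conf directives.
# The "save <seconds> <changes>" thresholds are illustrative values only.
def redis_persistence_directives(mode: str) -> list[str]:
    rdb_snapshots = ["save 900 1", "save 300 10"]  # snapshot after N seconds if M keys changed
    aof_log = ["appendonly yes", "appendfsync everysec"]  # fsync the write log once per second

    if mode == "rdb":
        # Snapshots only: cheapest, but writes since the last snapshot can be lost.
        return rdb_snapshots + ["appendonly no"]
    if mode == "aof":
        # Write log only: an empty save string disables RDB snapshots.
        return ['save ""'] + aof_log
    if mode == "both":
        # Snapshots plus write log: most durable, highest disk I/O.
        return rdb_snapshots + aof_log
    raise ValueError(f"unknown persistence mode: {mode!r}")


if __name__ == "__main__":
    # Print a fragment that could be pasted into redis.conf.
    print("\n".join(redis_persistence_directives("both")))
```

For a cache that only sees a few updates per minute, the extra I/O of "both" is probably negligible, so durability would likely be the deciding factor.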

In general, I would like to say thank you to all the people who constantly develop the NiFi ecosystem! Well done.

regards,
Jörg


On 2022/10/14 16:22:14 Chris Sampson wrote:
> The DistributedMapCacheServer is, I believe, meant as a reference
> implementation of the service to be used as an example rather than in
> production. The kind of scenario you describe is exactly the reason not to
> use this in-memory (optionally locally persisted on disk) cache in a
> clustered production environment.
>
> That said, it can be used if the use case of the Flow doesn't have problems
> if a node goes offline, etc.
>
> The recommended approach is to use an external service such as Redis with
> the RedisDistributedMapCacheClientService [1]. This can interface with your
> external Redis cluster/instance using the same API. Other external services
> can be used, see the selection of related Controller Services in the nifi
> docs [2] (e.g. search for "cache").
>
> [1]:
> https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-redis-nar/1.18.0/org.apache.nifi.redis.service.RedisDistributedMapCacheClientService/index.html
>
> [2]: https://nifi.apache.org/docs/nifi-docs
>
> On Fri, 14 Oct 2022, 17:13 Jörg Hammerbacher, <[email protected]> wrote:
>
> > Hi,
> >
> >
> > I have one thing I am looking for a solution to. Maybe someone can
> > help me out or give me a hint on how to do it.
> >
> >
> > Problem:
> >
> > I often use NiFi clusters with a "DistributedMapCacheClientService",
> > which uses a "DistributedMapCacheServer" for cluster-wide key/value
> > storage. By default the DMCS keeps data in memory and uses sockets for
> > synchronization. We use a persistence directory to make the data
> > persistent so that it is not lost when the entire cluster restarts.
> > But if the primary node changes, I think the data will be outdated,
> > because it is then read from another, potentially outdated node. If
> > this other node takes over the primary node role, old data will be used
> > for the next FetchDistributedMapCache. The latest updates made via the
> > old primary node are gone.
> >
> > Is there a service that uses e.g. ZooKeeper "in the background" to get a
> > truly distributed, persistent cache, even after restarting the entire
> > cluster / all nodes?
> >
> >
> > I know the standard cache can serve very frequent reads/updates when
> > the data is in memory. But if we need just one or at most a few updates
> > per minute ...
> >
> > Yes, using another system like a database (as a persistent singleton)
> > can be a solution, but not a really fitting one. Why is there no standard
> > service in NiFi for this? Is it not a good idea, or am I the only one who
> > has had this problem?
> >
> >
> > Thanks in advance for answers,
> >
> > Jörg (Hammerbacher)
> >
> >
> >
>

--
Kind regards,
Jörg Hammerbacher
http://www.hammerbacher-it.de
[email protected]
