[ https://issues.apache.org/jira/browse/UNOMI-908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18022630#comment-18022630 ]
Serge Huber commented on UNOMI-908:
-----------------------------------
FYI, here is the ticket about the cache unification:
https://issues.apache.org/jira/browse/UNOMI-880
> Introduce Distributed Cache to avoid intensive polling
> ------------------------------------------------------
>
> Key: UNOMI-908
> URL: https://issues.apache.org/jira/browse/UNOMI-908
> Project: Apache Unomi
> Issue Type: Improvement
> Components: unomi(-core)
> Affects Versions: unomi-3.0.0
> Reporter: Jerome Blanchard
> Priority: Major
>
> h3. Context
> Currently, some Unomi entities (such as rules, segments, propertyTypes...) are
> polled from Elasticsearch every second using a scheduled job, to ensure that
> if another Unomi node has modified them, the changes are refreshed locally.
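> For illustration only, this polling roughly follows the pattern below; the
> class and method names are hypothetical stand-ins, not the actual Unomi code:
> {code:java}
> import java.util.List;
> import java.util.concurrent.Executors;
> import java.util.concurrent.ScheduledExecutorService;
> import java.util.concurrent.TimeUnit;
>
> public class PollingRefreshSketch {
>     // Hypothetical stand-in for Unomi's persistence layer.
>     interface PersistenceService {
>         List<String> loadAllRuleIds();
>     }
>
>     private volatile List<String> cachedRuleIds;
>
>     void startPolling(PersistenceService persistence) {
>         ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
>         // Reload the full rule set from Elasticsearch every second,
>         // whether or not another node actually changed anything.
>         scheduler.scheduleAtFixedRate(
>                 () -> cachedRuleIds = persistence.loadAllRuleIds(),
>                 0, 1, TimeUnit.SECONDS);
>     }
> }
> {code}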
> This approach, while functional, is inefficient, especially when the entities
> are updated at a low frequency.
> Worse, without a robust scheduler engine capable of watchdog and failover,
> such a scheduler implementation can die silently, causing invisible integrity
> problems that lead to corrupted data. We have already faced production issues
> of this kind, with nodes running different rule sets.
> A second point is that with the removal of Karaf Cellar (the Karaf cluster
> and config propagation feature) in Unomi 3, another cluster topology
> monitoring mechanism has been introduced. This new implementation relies on a
> dedicated entity, ClusterNode, stored in a dedicated Elasticsearch index.
> Every 10 seconds (also using a scheduled job), the ClusterNode document is
> updated by setting its heartbeat field to the current timestamp. At the same
> time, the other ClusterNode documents are checked to see whether their latest
> heartbeat is fresh enough for them to be kept.
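> For illustration, this heartbeat logic is roughly equivalent to the following
> sketch; the 30-second staleness threshold and all names are assumptions, not
> the actual Unomi implementation:
> {code:java}
> import java.time.Duration;
> import java.time.Instant;
> import java.util.Map;
>
> public class HeartbeatSketch {
>     // Hypothetical stand-in for the ClusterNode documents stored in Elasticsearch.
>     private final Map<String, Instant> heartbeats; // nodeId -> last heartbeat
>     private static final Duration MAX_AGE = Duration.ofSeconds(30); // assumed threshold
>
>     HeartbeatSketch(Map<String, Instant> heartbeats) {
>         this.heartbeats = heartbeats;
>     }
>
>     // Called every 10 seconds by a scheduled job.
>     void beat(String selfNodeId) {
>         // Update our own heartbeat to the current timestamp...
>         heartbeats.put(selfNodeId, Instant.now());
>         // ...then drop peers whose latest heartbeat is not fresh enough.
>         Instant cutoff = Instant.now().minus(MAX_AGE);
>         heartbeats.values().removeIf(last -> last.isBefore(cutoff));
>     }
> }
> {code}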
> This topology management is very resource-intensive and does not follow the
> state of the art in terms of architecture.
> Topology information is not something that needs to be persisted (except for
> audit purposes, which do not apply here) and should be managed in memory
> using dedicated, proven algorithms.
> In enterprise application servers and frameworks (Jakarta EE, .NET, Spring),
> this kind of transversal, generic service is generally built in and offered
> by the platform, avoiding the need for a specific implementation.
> We may consider packaging it as a dedicated feature, to fully decouple this
> concern from the Unomi logic and to allow better isolated testing with
> specific scenarios, outside of Unomi.
> h3. Proposal
> The goal here is to propose an approach that addresses both problems by
> relying on an external, proven and widely used solution: distributed caching
> with Infinispan.
> Because distributed caching libraries need to rely on an internal cluster
> topology manager, we can use the same tool both to manage entity caches
> without polling AND to discover and monitor the cluster topology.
> We propose to use a generic caching feature, packaged for Karaf and embedding
> Infinispan. It will be exposed as a dedicated generic caching service based
> on annotated methods, directly inspired by the current Unomi entity cache.
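> As a minimal sketch of what such an annotated caching service could look
> like, assuming JSR-107 (JCache) style annotations, which Infinispan supports
> (the service and cache names here are hypothetical):
> {code:java}
> import javax.cache.annotation.CacheKey;
> import javax.cache.annotation.CacheRemove;
> import javax.cache.annotation.CacheResult;
>
> // Hypothetical service shape; the caching feature would provide the interceptor wiring.
> public class RuleCacheService {
>
>     @CacheResult(cacheName = "rules")
>     public String loadRule(@CacheKey String ruleId) {
>         // Only executed on a cache miss; otherwise the cached value is served.
>         return loadFromElasticsearch(ruleId);
>     }
>
>     @CacheRemove(cacheName = "rules")
>     public void evictRule(@CacheKey String ruleId) {
>         // Eviction propagates to all cluster members through the distributed cache.
>     }
>
>     private String loadFromElasticsearch(String ruleId) {
>         return "rule-" + ruleId; // placeholder for the real ES lookup
>     }
> }
> {code}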
> Thus, the underlying JGroups library used by Infinispan will also be exposed,
> so that the Unomi ClusterService can be refactored to rely on it instead of
> using a persistent entity.
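> Infinispan already exposes cluster membership changes as events, so a
> refactored ClusterService could subscribe to them instead of persisting
> heartbeats. A minimal sketch using the Infinispan listener API (the listener
> class itself is hypothetical):
> {code:java}
> import org.infinispan.configuration.global.GlobalConfigurationBuilder;
> import org.infinispan.manager.DefaultCacheManager;
> import org.infinispan.notifications.Listener;
> import org.infinispan.notifications.cachemanagerlistener.annotation.ViewChanged;
> import org.infinispan.notifications.cachemanagerlistener.event.ViewChangedEvent;
>
> @Listener
> public class TopologySketch {
>
>     @ViewChanged
>     public void onViewChange(ViewChangedEvent event) {
>         // JGroups (via Infinispan) pushes membership changes; no polling needed.
>         System.out.println("Cluster members: " + event.getNewMembers());
>     }
>
>     public static void main(String[] args) {
>         DefaultCacheManager manager = new DefaultCacheManager(
>                 GlobalConfigurationBuilder.defaultClusteredBuilder().build());
>         manager.addListener(new TopologySketch());
>     }
> }
> {code}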
> By externalizing caching into a dedicated, widely used and proven solution,
> the Unomi code will become lighter and more robust when managing
> cluster-oriented operations on entities.
> The use of a distributed cache for persistent entities has been widespread
> for decades and integrated into all enterprise-level frameworks (EJB,
> Spring, ...) for a very long time. This is proven technology with very
> strong implementations and support, and Infinispan is one of the best
> references in that domain (used in WildFly, Hibernate, Apache Camel, ...).
> h3. Tasks
> * Package a Unomi Cache feature that relies on an embedded Infinispan.
> * Refactor ClusterServiceImpl to take advantage of the Infinispan cluster
> manager, or simply store ClusterNode in the distributed cache instead of in
> Elasticsearch.
> * Remove Elasticsearch-based persistence logic for ClusterNode.
> * Ensure heartbeat updates are managed via the distributed cache; if not,
> rely on the distributed cache's underlying cluster management to handle
> ClusterNode entities (JGroups for Infinispan).
> * Remove the entity polling feature and use the distributed caching strategy
> for the operations that load entities from storage (see the sketch after
> this list).
> ** The current listRules() is refactored to simply load entities from ES, but
> with distributed caching.
> ** The updateRule() operation will also propagate the update to the
> distributed cache, avoiding any polling latency.
> * Update documentation to reflect the new architecture.
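> As a sketch of the refactored read/update path referenced above, assuming a
> replicated Infinispan cache exposed as a ConcurrentMap (all names below are
> hypothetical stand-ins for the real Unomi types):
> {code:java}
> import java.util.concurrent.ConcurrentMap;
>
> public class RulesServiceSketch {
>     // Backed by a replicated Infinispan cache (Cache implements ConcurrentMap).
>     private final ConcurrentMap<String, String> ruleCache;
>
>     public RulesServiceSketch(ConcurrentMap<String, String> ruleCache) {
>         this.ruleCache = ruleCache;
>     }
>
>     public String getRule(String ruleId) {
>         // Read-through: hit the distributed cache first, fall back to Elasticsearch.
>         return ruleCache.computeIfAbsent(ruleId, this::loadFromElasticsearch);
>     }
>
>     public void updateRule(String ruleId, String rule) {
>         saveToElasticsearch(ruleId, rule);
>         // Write-through: the replicated cache propagates the new value to all
>         // nodes immediately, removing any polling latency.
>         ruleCache.put(ruleId, rule);
>     }
>
>     private String loadFromElasticsearch(String ruleId) { return "rule-" + ruleId; }
>     private void saveToElasticsearch(String ruleId, String rule) { /* ES update */ }
> }
> {code}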
> h3. Definition of Done
> * ClusterNode information is available and updated without Elasticsearch.
> * No additional Elasticsearch index is created for cluster nodes.
> * Heartbeat mechanism works reliably.
> * All 'cacheable' entities rely on the dedicated cluster-aware cache feature
> based on the Infinispan Karaf feature.
> * All polling jobs are removed.
> * A test for entity update propagation across the cluster is set up (see the
> sketch after this list).
> * All relevant documentation is updated.
> * Integration tests confirm correct cluster node management and heartbeat
> updates.
> * No regression in cluster management functionality.
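> As a starting point for the propagation test mentioned above, a minimal
> sketch that starts two in-JVM Infinispan nodes sharing a replicated cache
> (the cache name and configuration are assumptions):
> {code:java}
> import org.infinispan.Cache;
> import org.infinispan.configuration.cache.CacheMode;
> import org.infinispan.configuration.cache.ConfigurationBuilder;
> import org.infinispan.configuration.global.GlobalConfigurationBuilder;
> import org.infinispan.manager.DefaultCacheManager;
>
> public class PropagationTestSketch {
>     public static void main(String[] args) {
>         DefaultCacheManager node1 = newClusteredManager();
>         DefaultCacheManager node2 = newClusteredManager();
>         Cache<String, String> rules1 = node1.getCache("rules");
>         Cache<String, String> rules2 = node2.getCache("rules");
>
>         rules1.put("rule-1", "updated");          // write on node 1...
>         System.out.println(rules2.get("rule-1")); // ...visible on node 2, no polling
>
>         node1.stop();
>         node2.stop();
>     }
>
>     private static DefaultCacheManager newClusteredManager() {
>         DefaultCacheManager manager = new DefaultCacheManager(
>                 GlobalConfigurationBuilder.defaultClusteredBuilder().build());
>         manager.defineConfiguration("rules", new ConfigurationBuilder()
>                 .clustering().cacheMode(CacheMode.REPL_SYNC).build());
>         return manager;
>     }
> }
> {code}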
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)