[ https://issues.apache.org/jira/browse/UNOMI-908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18022630#comment-18022630 ]

Serge Huber commented on UNOMI-908:
-----------------------------------

FYI, here is the ticket about the cache unification:
https://issues.apache.org/jira/browse/UNOMI-880

 

> Introduce Distributed Cache to avoid intensive polling
> ------------------------------------------------------
>
>                 Key: UNOMI-908
>                 URL: https://issues.apache.org/jira/browse/UNOMI-908
>             Project: Apache Unomi
>          Issue Type: Improvement
>          Components: unomi(-core)
>    Affects Versions: unomi-3.0.0
>            Reporter: Jerome Blanchard
>            Priority: Major
>
> h3. Context
> Currently, some Unomi entities (like rules, segments, propertyTypes...) are 
> polled in Elasticsearch every second using a scheduled job, to ensure that 
> if another Unomi node has modified them, the local copy is refreshed.
> This approach, while functional, is inefficient, especially when the entity 
> is updated infrequently.
> Moreover, without a robust scheduler engine with watchdog and failover 
> capabilities, such a scheduler implementation can die silently, causing 
> invisible integrity problems that lead to corrupted data. We have already 
> faced similar production issues where nodes ended up with different rule 
> sets.
> A second point: with the removal of Karaf Cellar (the Karaf cluster bundle 
> and config propagation feature) in Unomi 3, a new cluster topology 
> monitoring mechanism was introduced. This implementation relies on a 
> dedicated entity, ClusterNode, stored in a dedicated Elasticsearch index.
> Every 10 seconds (also using a scheduled job), the ClusterNode document is 
> updated by setting its heartbeat field to the current timestamp. At the 
> same time, the other ClusterNode documents are checked to see whether their 
> latest heartbeat is fresh enough to keep them.
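For illustration, the heartbeat freshness check described above can be sketched as follows. Class names, the 30-second threshold, and the method are hypothetical, not actual Unomi code:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of the scheduled heartbeat/freshness check: every node
// refreshes its own heartbeat timestamp, then drops peers whose last
// heartbeat is older than a staleness threshold.
public class HeartbeatCheckSketch {

    // Illustrative threshold: a node missing three 10-second beats is stale.
    static final Duration STALE_THRESHOLD = Duration.ofSeconds(30);

    public record ClusterNode(String id, Instant heartbeat) {}

    // Keep only the nodes whose heartbeat is still fresh enough.
    public static List<ClusterNode> filterLiveNodes(List<ClusterNode> nodes, Instant now) {
        return nodes.stream()
                .filter(n -> Duration.between(n.heartbeat(), now).compareTo(STALE_THRESHOLD) <= 0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        List<ClusterNode> nodes = List.of(
                new ClusterNode("node-1", now.minusSeconds(5)),    // fresh, kept
                new ClusterNode("node-2", now.minusSeconds(120))); // stale, dropped
        System.out.println(filterLiveNodes(nodes, now).size()); // prints 1
    }
}
```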
> This topology management is very resource-intensive and does not follow the 
> state of the art in terms of architecture.
> Topology information does not need to be persisted (except for audit 
> purposes, which do not apply here) and should be managed in memory using 
> dedicated, proven algorithms.
> In enterprise application servers and frameworks (Jakarta EE, .NET, 
> Spring), this kind of transversal, generic service is built in and offered 
> by the platform, avoiding the need for a specific implementation.
> We may consider packaging it as a dedicated feature to fully decouple this 
> logic from Unomi and to allow better isolated testing with specific 
> scenarios, outside of Unomi.
> h3. Proposal
> The goal here is to propose a solution that addresses both problems by 
> relying on an external, proven and widely used solution: distributed 
> caching with Infinispan.
> Because distributed caching libraries need an internal cluster topology 
> manager, we can use the same tooling both to manage entity caches without 
> polling AND to discover and monitor the cluster topology.
> We propose a generic caching feature packaged for Karaf that embeds 
> Infinispan. It will be provided as a dedicated generic caching service 
> based on annotated methods and directly inspired by the current Unomi 
> entity cache.
> The underlying JGroups library used by Infinispan will also be exposed, so 
> the Unomi ClusterService can be refactored to rely on it instead of a 
> persistent entity.
> By externalizing caching into a dedicated, widely used and proven solution, 
> the Unomi code will become lighter and more robust when managing 
> cluster-oriented operations on entities.
> Using a distributed cache for persistent entities has been common practice 
> for decades and has long been integrated in enterprise-level frameworks 
> (EJB, Spring, ...). This is proven technology with very strong 
> implementations and support, and Infinispan is one of the best references 
> in that domain (used in WildFly, Hibernate, Apache Camel, ...).
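As a rough illustration of this dual role (not Unomi code; the cache name and the listener are assumptions), an embedded Infinispan cache manager can provide both a synchronously replicated entity cache and JGroups-backed topology events from a single component:

```java
// Illustrative sketch only. Requires the Infinispan embedded dependency
// (org.infinispan:infinispan-core) and a reachable JGroups transport;
// it is not meant to run standalone.
import org.infinispan.Cache;
import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.global.GlobalConfigurationBuilder;
import org.infinispan.manager.DefaultCacheManager;
import org.infinispan.notifications.Listener;
import org.infinispan.notifications.cachemanagerlistener.annotation.ViewChanged;
import org.infinispan.notifications.cachemanagerlistener.event.ViewChangedEvent;

public class ClusterCacheSketch {

    @Listener
    public static class TopologyListener {
        // Fired by Infinispan (via JGroups) whenever a node joins or leaves:
        // this is what could replace the heartbeat-in-Elasticsearch mechanism.
        @ViewChanged
        public void onViewChanged(ViewChangedEvent event) {
            System.out.println("Cluster members: " + event.getNewMembers());
        }
    }

    public static void main(String[] args) {
        // Clustered cache manager: JGroups handles discovery and membership.
        DefaultCacheManager manager = new DefaultCacheManager(
                GlobalConfigurationBuilder.defaultClusteredBuilder().build());
        manager.addListener(new TopologyListener());

        // A synchronously replicated cache for entities such as rules:
        // a put() on one node becomes visible on all nodes, with no polling.
        manager.defineConfiguration("rules", new ConfigurationBuilder()
                .clustering().cacheMode(CacheMode.REPL_SYNC).build());
        Cache<String, String> rules = manager.getCache("rules");
        rules.put("rule-1", "{...}");

        manager.stop();
    }
}
```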
> h3. Tasks
>  * Package a Unomi cache feature that relies on an embedded Infinispan.
>  * Refactor ClusterServiceImpl to take advantage of the Infinispan cluster 
> manager, or simply store ClusterNode in the distributed cache instead of in 
> Elasticsearch.
>  * Remove Elasticsearch-based persistence logic for ClusterNode.
>  * Ensure heartbeat updates are managed via the distributed cache; if not, 
> rely on the cache's underlying cluster management to manage ClusterNode 
> entities (JGroups for Infinispan).
>  * Remove the entity polling feature and use the distributed caching 
> strategy for the operations that load entities from storage.
>  ** The current listRules() is refactored to simply load entities from 
> Elasticsearch, but with distributed caching.
>  ** The updateRule() operation will also propagate the update through the 
> distributed cache, avoiding any polling latency.
>  * Update the documentation to reflect the new architecture.
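The last two sub-tasks amount to a cache-aside pattern. In this sketch a plain Map stands in for both the Infinispan replicated cache and the Elasticsearch store, and all names are illustrative rather than Unomi's actual API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: listRules() reads through the (distributed) cache,
// updateRule() writes through it so peers see the change without polling.
public class RuleCacheSketch {

    public record Rule(String id, String definition) {}

    // Stand-in for the replicated cache; with Infinispan, a put() here would
    // be propagated to every node in the cluster.
    private final Map<String, Rule> cache = new ConcurrentHashMap<>();
    // Stand-in for Elasticsearch persistence.
    private final Map<String, Rule> storage = new ConcurrentHashMap<>();

    public List<Rule> listRules() {
        if (cache.isEmpty()) {
            cache.putAll(storage); // cache miss: load from storage once
        }
        return new ArrayList<>(cache.values());
    }

    public void updateRule(Rule rule) {
        storage.put(rule.id(), rule); // persist the update
        cache.put(rule.id(), rule);   // propagate it via the distributed cache
    }
}
```

With a replicated cache backing the Map, the `cache.put()` in `updateRule()` is what removes the polling latency: every other node sees the new rule as soon as the write completes.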
> h3. Definition of Done
>  * ClusterNode information is available and updated without Elasticsearch.
>  * No additional Elasticsearch index is created for cluster nodes.
>  * Heartbeat mechanism works reliably.
>  * All 'cacheable' entities rely on the dedicated cluster-aware cache 
> feature based on the Infinispan Karaf feature.
>  * All polling jobs are removed.
>  * A test for entity update propagation across the cluster is set up.
>  * All relevant documentation is updated.
>  * Integration tests confirm correct cluster node management and heartbeat 
> updates.
>  * No regression in cluster management functionality.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
