Benjamin BONNET created CAMEL-17157:
---------------------------------------

             Summary: AggregateProcessor, TimeoutMap Restoration and Cluster
                 Key: CAMEL-17157
                 URL: https://issues.apache.org/jira/browse/CAMEL-17157
             Project: Camel
          Issue Type: Bug
          Components: camel-core
    Affects Versions: 3.12.0, 3.11.3
            Reporter: Benjamin BONNET


Hi,
Consider an aggregator that has a completion timeout and is backed by a 
persistent repository (e.g. JdbcAggregationRepository). When the route starts, 
restoreTimeoutMapFromAggregationRepository() is invoked 
(AggregateProcessor, line 877). That method consists of:
# getting the keys of all pending aggregations (i.e. aggregations that were not 
yet completed when the route stopped)
# iterating over those keys to fetch each row and put its timeout into the 
timeout map.
That works fine when there is only one instance, but on a cluster things may 
go wrong.
If one instance is warming up while another is modifying the repository, 
warm-up may fail with a NullPointerException: that occurs when a row has been 
deleted (because the aggregation was completed by a running instance) between 
steps 1 and 2.
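
To make the race concrete, here is a minimal sketch of the two-step restore. 
It is not Camel's code: the class RestoreRaceSketch, its methods and the 
Map used as a stand-in for the shared repository are all hypothetical, and 
the concurrent delete is simulated sequentially between the two steps.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; none of these names exist in Camel.
class RestoreRaceSketch {

    // Step 1: snapshot the keys of the pending aggregations.
    static Set<String> snapshotKeys(Map<String, Long> repository) {
        return new HashSet<>(repository.keySet());
    }

    // Step 2 without a guard: a row deleted after the snapshot yields null,
    // and unboxing that null throws a NullPointerException.
    static Map<String, Long> restoreUnguarded(Map<String, Long> repository, Set<String> keys) {
        Map<String, Long> timeoutMap = new HashMap<>();
        for (String key : keys) {
            long timeout = repository.get(key);
            timeoutMap.put(key, timeout);
        }
        return timeoutMap;
    }

    // Step 2 with a guard: rows that vanished since the snapshot are skipped.
    static Map<String, Long> restoreGuarded(Map<String, Long> repository, Set<String> keys) {
        Map<String, Long> timeoutMap = new HashMap<>();
        for (String key : keys) {
            Long timeout = repository.get(key);
            if (timeout == null) {
                continue; // completed by another instance in the meantime
            }
            timeoutMap.put(key, timeout);
        }
        return timeoutMap;
    }

    public static void main(String[] args) {
        Map<String, Long> repository = new ConcurrentHashMap<>();
        repository.put("order-1", 1000L);
        repository.put("order-2", 2000L);

        Set<String> keys = snapshotKeys(repository);
        repository.remove("order-1"); // simulates the other instance completing it

        try {
            restoreUnguarded(repository, keys);
        } catch (NullPointerException e) {
            System.out.println("unguarded restore failed: row deleted after key snapshot");
        }
        System.out.println("guarded restore: " + restoreGuarded(repository, keys));
    }
}
```

The null check only avoids the exception; it does nothing for rows that are 
created after the key snapshot.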
One can imagine another, less noisy failure: a row is created by a running 
instance between steps 1 and 2. Warm-up then does not complain, but the new 
row is not included in the timeout map. That may be an issue if the instance 
that inserted the row into the repository is stopped before completion (the 
timeout will not be detected).
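
The silent case can be sketched the same way. The code below is only an 
illustration under assumptions, not a proposed patch: the class 
TimeoutMapReconcileSketch and its reconcile() method are hypothetical, the 
repository is again modelled as a plain Map, and the idea of re-scanning the 
repository after warm-up is just one possible way to pick up missed rows.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; none of these names exist in Camel.
class TimeoutMapReconcileSketch {

    // Re-scan the repository: add rows missing from the timeout map and
    // drop entries whose row no longer exists. Returns the number of rows added.
    static int reconcile(Map<String, Long> repository, Map<String, Long> timeoutMap) {
        int added = 0;
        for (Map.Entry<String, Long> row : repository.entrySet()) {
            if (timeoutMap.putIfAbsent(row.getKey(), row.getValue()) == null) {
                added++;
            }
        }
        timeoutMap.keySet().removeIf(key -> !repository.containsKey(key));
        return added;
    }

    public static void main(String[] args) {
        Map<String, Long> repository = new ConcurrentHashMap<>();
        repository.put("order-1", 1000L);

        // Warm-up restored only what existed at snapshot time.
        Map<String, Long> timeoutMap = new HashMap<>(repository);

        // Another instance inserts a row after the snapshot, then stops.
        repository.put("order-2", 2000L);
        System.out.println("tracked before reconcile: " + timeoutMap.containsKey("order-2"));

        int added = reconcile(repository, timeoutMap);
        System.out.println("rows added by reconcile: " + added);
        System.out.println("tracked after reconcile: " + timeoutMap.containsKey("order-2"));
    }
}
```

A single re-scan only narrows the window; rows inserted by other instances 
later on would still need a periodic pass or some other notification 
mechanism.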



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
