Timur created CAMEL-20850:
-----------------------------

             Summary: LRUCache evicts entries unexpectedly
                 Key: CAMEL-20850
                 URL: https://issues.apache.org/jira/browse/CAMEL-20850
             Project: Camel
          Issue Type: Bug
            Reporter: Timur


h1. Summary

We encountered an infinite loop while downloading files using the SFTP endpoint.
During our investigation, we discovered that the {{InProgressRepository}} uses 
the {{SimpleLRUCache}}. This cache has an unconventional eviction 
implementation based on a queue that tracks changes.

Any modification of the cache, even putting the same element again, increases 
the size the cache tracks internally, which can trigger eviction even when the 
cache holds a single element.
The following code reproduces the problem:
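{code:java}
import java.util.Map;
import org.apache.camel.support.LRUCacheFactory;

// A cache with a maximum size of 1 should always retain its single entry.
Map<Object, Object> lruCache = LRUCacheFactory.newLRUCache(1, 1);
int numberOfInserts = 0;
lruCache.put("key", "value");
// Re-insert the same entry until the cache unexpectedly evicts it.
while (lruCache.size() > 0) {
    lruCache.put("key", "value");
    numberOfInserts++;
}
throw new IllegalStateException(
        "We should never reach this point, we inserted the same element " + numberOfInserts + " times");
{code}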

In the provided code, the failure occurs when the number of inserts reaches 2.
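
To make the mechanism described above easier to picture, here is a toy model of a cache whose limit is checked against a queue of changes rather than against the number of distinct keys. It is a sketch under that assumption only and not the actual {{SimpleLRUCache}} code; the class {{ChangeQueueCacheSketch}} and the method {{putsUntilEmpty}} are hypothetical names introduced for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model, NOT the Camel implementation.
final class ChangeQueueCacheSketch<K, V> {
    private final int maxSize;
    private final Map<K, V> entries = new HashMap<>();
    // Every put is recorded here, even when the key is already present.
    private final Deque<K> changes = new ArrayDeque<>();

    ChangeQueueCacheSketch(int maxSize) {
        this.maxSize = maxSize;
    }

    void put(K key, V value) {
        entries.put(key, value);
        changes.addLast(key);
        // Assumed flaw: the limit is compared with the number of recorded
        // changes, so repeated puts of one key eventually evict that key.
        while (changes.size() > maxSize) {
            entries.remove(changes.pollFirst());
        }
    }

    int size() {
        return entries.size();
    }

    // Counts how many puts of the same key it takes until the cache is empty.
    static int putsUntilEmpty(int maxSize) {
        ChangeQueueCacheSketch<String, String> cache = new ChangeQueueCacheSketch<>(maxSize);
        int puts = 0;
        do {
            cache.put("key", "value");
            puts++;
        } while (cache.size() > 0);
        return puts;
    }
}
```

In this toy model the single entry disappears after {{maxSize + 1}} puts of the same key, although the number of distinct keys never exceeds one. The exact counts in the real cache may differ.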

h1. Expected Result
The LRU cache should be limited only by the maximum size passed to it, not by 
internal implementation limits.

h1. Proposed Solution
Implement a standard LRU cache, for example by leveraging {{LRUMap}} from 
Apache Commons.
We may provide a PR later on.
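
For illustration, the expected behavior can be sketched with the JDK alone. The example below uses {{java.util.LinkedHashMap}} in access order instead of the {{LRUMap}} mentioned above, purely to keep the sketch free of dependencies; {{BoundedLruMap}} is a hypothetical name and not a proposed Camel class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a size-bounded LRU map built on the JDK.
final class BoundedLruMap<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    BoundedLruMap(int maxSize) {
        // accessOrder = true: iteration order runs from least to most recently used.
        super(16, 0.75f, true);
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict only when the number of distinct entries exceeds the limit.
        return size() > maxSize;
    }

    public static void main(String[] args) {
        BoundedLruMap<String, String> cache = new BoundedLruMap<>(1);
        for (int i = 0; i < 1000; i++) {
            cache.put("key", "value");
        }
        System.out.println(cache.size()); // prints 1
    }
}
```

Because eviction depends only on the number of entries, re-inserting an existing key never removes it. Note that {{LinkedHashMap}} is not thread-safe, so concurrent use would need external synchronization.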

h1. The Background
We need to use a dynamic SFTP route (the download parameters depend on 
configuration passed in from the outside). This is not possible by default, so 
we have to use {{pollEnrich}}:
{code:java}
.pollEnrich().simple(SFTP_DYNAMIC_URI)
             .timeout(RETURN_CURRENT_RESULT_WITHOUT_WAIT)
{code}

{{pollEnrich}} takes only one element at a time (see 
{{GenericFilePollingConsumer}}). The file consumer requests the list of all 
files from the SFTP endpoint and takes the first one, then requests the list 
again and takes the second file, and so on.

The combination of {{InProgressRepository}} and idempotent consumer is used to 
avoid handling the same files again.
However, every time the SFTP endpoint lists files, the endpoints and polling 
consumers adjust the list in the {{InProgressRepository}}, thus affecting the 
{{SimpleLRUCache}}.
The number of elements in the cache doesn't grow; however, the number of 
changes grows almost quadratically (roughly n^2).

With a limit of 50k files, the {{SimpleLRUCache}} already starts evicting 
entries after ~600(!) files. This leads to an infinite loop while processing 
the files.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
