[ 
https://issues.apache.org/jira/browse/CAMEL-20850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timur updated CAMEL-20850:
--------------------------
    Description: 
h1. Summary

We encountered an infinite loop while downloading files through the SFTP 
endpoint.
During our investigation, we found that the {{InProgressRepository}} uses 
the {{SimpleLRUCache}}. This cache has an unconventional eviction 
implementation that relies on a queue tracking changes.

Any modification of the cache, even putting the same element again, is counted 
as an increase in the cache size, so eviction can occur even when the cache 
holds a single element.
The following code reproduces the problem:
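{code:java}
import java.util.Map;
import org.apache.camel.support.LRUCacheFactory;

// Cache with an initial capacity of 1 and a maximum size of 1
Map<Object, Object> lruCache = LRUCacheFactory.newLRUCache(1, 1);
int numberOfInserts = 0;
lruCache.put("key", "value");
// Re-inserting the same key should never empty the cache,
// so this loop is expected to run forever
while (lruCache.size() > 0) {
    lruCache.put("key", "value");
    numberOfInserts++;
}
throw new IllegalStateException(
        "We should never reach this point, we inserted the same element "
                + numberOfInserts + " times");
{code}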

In the provided code, the failure occurs when the number of inserts reaches 2.

h1. Expected Result
The LRU cache should be limited only by the maximum size passed to it, not by 
internal implementation limits.

h1. Proposed Solution
Implement a standard LRU cache, for example by using {{LRUMap}} from Apache 
Commons Collections.
We may provide a PR later on.
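
To make "standard LRU cache" concrete, here is a minimal sketch that uses only the JDK's {{LinkedHashMap}} in access order. The names {{StandardLruSketch}} and {{BoundedLruMap}} are hypothetical and not part of Camel or Apache Commons; the sketch only shows the expected behavior (eviction driven by the number of distinct keys), not a proposed patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class StandardLruSketch {

    /** Hypothetical helper, not part of Camel: a size-bounded, access-ordered map. */
    static final class BoundedLruMap<K, V> extends LinkedHashMap<K, V> {
        private final int maximumSize;

        BoundedLruMap(int maximumSize) {
            // accessOrder = true orders entries from least to most recently used
            super(16, 0.75f, true);
            this.maximumSize = maximumSize;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            // Evict only when the number of distinct keys exceeds the limit
            return size() > maximumSize;
        }
    }

    public static void main(String[] args) {
        Map<String, String> cache = new BoundedLruMap<>(1);
        for (int i = 0; i < 1_000; i++) {
            cache.put("key", "value"); // re-inserting the same key never evicts it
        }
        System.out.println(cache.size()); // prints 1

        cache.put("other", "value"); // a second distinct key evicts the older one
        System.out.println(cache.containsKey("key")); // prints false
    }
}
```

Note that this sketch is not thread-safe, so a real replacement would need synchronization or a concurrent design.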

h1. The Background
We need a dynamic SFTP route (the download parameters depend on configuration 
passed in from outside). By default, this is not possible, so we have to use 
{{pollEnrich}}:
{code:java}
.pollEnrich().simple(SFTP_DYNAMIC_URI)
             .timeout(RETURN_CURRENT_RESULT_WITHOUT_WAIT)
{code}

{{pollEnrich}} takes only one element at a time (see 
{{GenericFilePollingConsumer}}). The file consumer requests the list of all 
files from the SFTP endpoint and takes the first one, then requests the list 
again and takes the second one, and so on.

The combination of {{InProgressRepository}} and an idempotent consumer is used 
to avoid handling the same files again.
However, every time the SFTP endpoint lists files, the endpoints and polling 
consumers update the entries in the {{InProgressRepository}}, and thus in the 
{{SimpleLRUCache}}.
The number of elements in the cache does not grow, but the number of changes 
grows almost quadratically (~n^2).

With a limit of 50k files, the {{SimpleLRUCache}} already starts evicting 
after ~600(!) files. This leads to an infinite loop while processing the files.
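
As a rough illustration of the quadratic growth, the sketch below counts changes under a simplified, assumed model: one poll per file, where each poll records one change for every file not yet processed. The class and method names are hypothetical, and the real number of repository updates per poll may differ, so the threshold will not match the observed ~600 exactly. It only shows that the change count reaches a 50k limit after a few hundred files rather than 50k.

```java
public class ChangeGrowthSketch {

    /** Total changes if each poll records one change per file not yet processed (assumed model). */
    static long totalChanges(int files) {
        long changes = 0;
        for (int remaining = files; remaining > 0; remaining--) {
            changes += remaining; // one change per listed file in this poll
        }
        return changes; // equals files * (files + 1) / 2
    }

    /** Smallest number of files whose total change count exceeds the given limit. */
    static int filesUntilLimit(long limit) {
        int files = 0;
        while (totalChanges(files) <= limit) {
            files++;
        }
        return files;
    }

    public static void main(String[] args) {
        System.out.println(totalChanges(600));       // prints 180300
        System.out.println(filesUntilLimit(50_000)); // prints 316
    }
}
```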


> LRUCache evicts entries unexpectedly
> ------------------------------------
>
>                 Key: CAMEL-20850
>                 URL: https://issues.apache.org/jira/browse/CAMEL-20850
>             Project: Camel
>          Issue Type: Bug
>            Reporter: Timur
>            Priority: Critical



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
