[
https://issues.apache.org/jira/browse/NIFI-3582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joseph Gresock updated NIFI-3582:
---------------------------------
Description:
The PersistentMapCache does not appear to evict records when used by the
DistributedMapCacheServer, at least in FIFO eviction mode.
To replicate this behavior in NiFi 1.1.1, add the following to your flow:
*Services*
* DistributedMapCacheServer, port 4557, maximum cache entries = 10,000, FIFO
eviction, persistence directory specified
* DistributedMapCacheClientService, point to the same host and port
*Flow*
GenerateFlowFile (random 1 KB binary content, batch size 10, 10 concurrent
tasks) -> HashContent (MD5) written to hash.value -> DetectDuplicate with
cache entry identifier = $\{hash.value\}, description = ., no age-off, the
cache client service above selected, cache the entry identifier = true
This quickly drives the snapshot file in the cache's persistence directory
past 100,000 keys, an order of magnitude beyond the configured 10,000-entry
maximum, and as far as I can tell the key count never goes back down.
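The same pressure can be applied programmatically. The sketch below is illustrative only, assuming {{client}} is an enabled DistributedMapCacheClientService (seen through the DistributedMapCacheClient interface) pointing at the server above; the CacheGrowthRepro class, the hammer method, and the serializers are mine, not part of the flow or of NiFi itself:
{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

import org.apache.nifi.distributed.cache.client.Deserializer;
import org.apache.nifi.distributed.cache.client.DistributedMapCacheClient;
import org.apache.nifi.distributed.cache.client.Serializer;

public class CacheGrowthRepro {

    // Simple String (de)serializers, similar in spirit to what DetectDuplicate uses.
    private static final Serializer<String> STRING_SERIALIZER =
            (value, out) -> out.write(value.getBytes(StandardCharsets.UTF_8));
    private static final Deserializer<String> STRING_DESERIALIZER =
            input -> input == null ? null : new String(input, StandardCharsets.UTF_8);

    /**
     * Pushes far more unique keys than the server's configured maximum (10,000 above).
     * With working FIFO eviction the persisted snapshot should stay near that maximum;
     * what we observe instead is unbounded growth of the snapshot.
     */
    public static void hammer(final DistributedMapCacheClient client) throws IOException {
        for (int i = 0; i < 100_000; i++) {
            final String key = UUID.randomUUID().toString(); // stands in for hash.value
            client.getAndPutIfAbsent(key, ".", STRING_SERIALIZER, STRING_SERIALIZER, STRING_DESERIALIZER);
        }
    }
}
{code}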
On our production system, the cache server is configured for 100,000 maximum
entries with FIFO eviction, and we recently saw the following log statement
showing the write-ahead log holding over 4 million entries:
{code}
nifi-app.log:2017-03-09 15:03:00,670 INFO [Distributed Cache Server
Communications Thread: ac907dec-49a4-439e-99f5-1558f2358d87]
org.wali.MinimalLockingWriteAheadLog
org.wali.MinimalLockingWriteAheadLog@40569408
checkpointed with 4262902 Records and 0 Swap Files in 256302 milliseconds
(Stop-the-world time = 1378 milliseconds, Clear Edit Logs time = 19 millis),
max Transaction ID 4263237
{code}
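For context on where the eviction seems to get lost: the in-memory cache enforces the FIFO limit, but the eviction apparently never reaches the persisted write-ahead log, so the snapshot retains every key ever written. The sketch below is illustrative only, not NiFi's actual PersistentMapCache/SimpleMapCache code; it just shows the invariant that appears to be violated, namely that each eviction must produce a corresponding removal in the persistent store:
{code}
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/**
 * Illustrative only -- not NiFi's classes. A FIFO-bounded map whose mutations are
 * mirrored into a persistent store, so evictions also shrink the persisted state.
 */
class FifoPersistentCache {

    interface PersistentStore {
        void recordPut(String key, byte[] value);
        void recordRemove(String key);   // the step that appears to be missing in practice
    }

    private final int maxEntries;
    private final Map<String, byte[]> entries = new HashMap<>();
    private final Queue<String> insertionOrder = new ArrayDeque<>();
    private final PersistentStore store;

    FifoPersistentCache(final int maxEntries, final PersistentStore store) {
        this.maxEntries = maxEntries;
        this.store = store;
    }

    void put(final String key, final byte[] value) {
        if (!entries.containsKey(key) && entries.size() >= maxEntries) {
            final String evicted = insertionOrder.poll();   // FIFO eviction
            entries.remove(evicted);
            store.recordRemove(evicted);  // without this, evictions never reach the snapshot
        }
        if (entries.put(key, value) == null) {
            insertionOrder.add(key);
        }
        store.recordPut(key, value);
    }
}
{code}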
> DistributedMapCacheServer does not evict persistent records
> -----------------------------------------------------------
>
> Key: NIFI-3582
> URL: https://issues.apache.org/jira/browse/NIFI-3582
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.1.1
> Reporter: Joseph Gresock
> Assignee: Mark Payne
>
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)