Hello,

Thanks for reporting the issues with the Volatile Content & FlowFile 
Repositories. These definitely sound like bugs. Do you mind filing a Jira [1] 
for these?
If you’d like to store everything in memory, though, my recommendation would 
honestly be to use a RAM disk rather than the volatile repositories, since the 
standard repositories are much more widely used and therefore extremely well 
tested, while the volatile implementations are much less so.
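
For example, you could keep the standard repository implementations and simply 
point their directories at a tmpfs mount in conf/nifi.properties (the mount 
point below is just an illustration; the property names are the standard ones):

nifi.flowfile.repository.directory=/mnt/nifi-ramdisk/flowfile_repository
nifi.content.repository.directory.default=/mnt/nifi-ramdisk/content_repository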

In terms of the last issue, in which NiFi becomes stuck after a week: what 
version of NiFi are you running? I would recommend gathering a thread dump 
(bin/nifi.sh dump dump1.txt) when this occurs and providing it so that it can 
be analyzed to determine what’s happening.

Thanks
-Mark


[1] https://issues.apache.org/jira/projects/NIFI



On Apr 21, 2020, at 11:35 PM, abcfd abcdd <[email protected]> wrote:

Hello,

Our team has worked with NiFi for over a year. Our scenario involves processing 
3-5 billion records with NiFi, and we found that WriteAheadFlowFileRepository 
and FileSystemRepository could not meet demand, so we put the data to be 
consumed in tmpfs and chose VolatileFlowFileRepository and 
VolatileContentRepository to reduce I/O costs and avoid the WAL, because in our 
scenario the data can be thrown away when backpressure occurs or NiFi 
restarts.
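
For reference, the relevant nifi.properties settings look roughly like this 
(the implementation keys are the standard ones; the size values are the ones 
used in problem 1 below):

nifi.flowfile.repository.implementation=org.apache.nifi.controller.repository.VolatileFlowFileRepository
nifi.content.repository.implementation=org.apache.nifi.controller.repository.VolatileContentRepository
nifi.volatile.content.repository.max.size=100 MB
nifi.volatile.content.repository.block.size=2 KB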

However, we have found three problems working with VolatileFlowFileRepository 
and VolatileContentRepository.
1. VolatileContentRepository
With maxSize = 100 MB and blockSize = 2 KB, there should be 51200 "slots". If 
we write 1 KB at a time, 102400 writes of 1 KB should fit in 100 MB, but on 
the 51201st 1 KB write, "java.io.IOException: Content Repository is out of 
space" occurs; each 1 KB claim apparently consumes an entire 2 KB block. 
Here's the JUnit test I wrote.

@Test
public void test() throws IOException {
    System.setProperty(NiFiProperties.PROPERTIES_FILE_PATH,
            TestVolatileContentRepository.class.getResource("/conf/nifi.properties").getFile());

    final Map<String, String> addProps = new HashMap<>();
    addProps.put(VolatileContentRepository.BLOCK_SIZE_PROPERTY, "2 KB");
    final NiFiProperties nifiProps = NiFiProperties.createBasicNiFiProperties(null, addProps);

    // claimManager is a ResourceClaimManager field on the test class
    final VolatileContentRepository contentRepo = new VolatileContentRepository(nifiProps);
    contentRepo.initialize(claimManager);

    // 100 MB / 1 KB = 102400 one-KB writes should fit, but blocks are
    // exhausted after write 51200: each claim occupies a full 2 KB block
    for (int idx = 0; idx < 51201; ++idx) {
        final ContentClaim claim = contentRepo.create(true);
        try (final OutputStream out = contentRepo.write(claim)) {
            final byte[] oneK = new byte[1024];
            Arrays.fill(oneK, (byte) 55);
            out.write(oneK);
        }
    }
}

2. VolatileFlowFileRepository
When backpressure occurs, FileSystemSwapManager swaps FlowFiles out to disk 
whenever the swap queue size exceeds 10000. There is no problem in the 
swap-out process, BUT on swap-in, VolatileFlowFileRepository never 
"acknowledges" the FlowFiles that were swapped out: when FileSystemSwapManager 
tries to swap FlowFiles back in from disk, it logs the warning "Cannot swap in 
FlowFiles from location..." because the implementation of 
"isValidSwapLocationSuffix" in VolatileFlowFileRepository always returns FALSE.
The queue then still appears FULL in the NiFi frontend and the upstream 
processor is STUCK, presumably because FileSystemSwapManager "thinks" these 
FlowFiles have still not been consumed.
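
To illustrate the interaction (a minimal, self-contained sketch with stand-in 
names, not the actual NiFi classes):

public class SwapInSketch {

    // Stand-in for the FlowFileRepository swap-location check described above.
    interface SwapAwareRepo {
        boolean isValidSwapLocationSuffix(String swapLocationSuffix);
    }

    // Stands in for VolatileFlowFileRepository: every suffix is rejected.
    static class VolatileRepo implements SwapAwareRepo {
        @Override
        public boolean isValidSwapLocationSuffix(final String suffix) {
            return false;
        }
    }

    // Stands in for FileSystemSwapManager's swap-in validation step.
    static void swapIn(final SwapAwareRepo repo, final String location) {
        if (!repo.isValidSwapLocationSuffix(location)) {
            System.out.println("Cannot swap in FlowFiles from location " + location);
            return; // FlowFiles stay swapped out; the queue still looks full
        }
        // ...would deserialize the swap file and re-queue the FlowFiles...
    }

    public static void main(final String[] args) {
        swapIn(new VolatileRepo(), "queue-1.swap");
    }
}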

3. We found that NiFi cannot run for more than a week even when we use 
WriteAheadFlowFileRepository and FileSystemRepository. NiFi gets stuck, 
processes no data, and writes nothing to nifi-app.log. Restarting NiFi brings 
it back to normal, but we don't know what happened.

Many thanks
