Thanks Mark, I have posted these issues in Jira. https://issues.apache.org/jira/projects/NIFI/issues/NIFI-7388?filter=allissues
Your suggestion is very constructive. We have switched back to WriteAheadFlowFileRepository and FileSystemRepository for now, and we look forward to a new version that fixes these issues.

We are running NiFi 1.11.4. The environment is jdk8_141, docker 18.04.0-ce and Kubernetes v1.13.3. We suspect the hang may be related to the JVM CodeCache, so we have increased CodeCacheSize to 256MB and are using CodeCacheFlushing. We will keep watching this issue and, as you suggested, gather a thread dump when the hang occurs.

ZhangXinchen

On 2020/04/22 14:51:51, Mark Payne <[email protected]> wrote:
> Hello,
>
> Thanks for reporting the issues with the Volatile Content & FlowFile
> Repositories. These definitely sound like bugs. Do you mind filing a Jira
> [1] for these?
> If you’d like to store everything in memory, though, my recommendation would
> honestly be to use a RAM disk rather than the volatile repositories, since
> the standard repositories are much more widely used and therefore extremely
> well tested, while the volatile implementations are much less so.
>
> In terms of the last issue, in which NiFi becomes stuck after a week: what
> version of NiFi are you running? I would recommend gathering a thread dump
> (bin/nifi.sh dump dump1.txt) when this occurs and providing the thread dump
> so that it can be analyzed to determine what’s happening.
>
> Thanks
> -Mark
>
>
> [1] https://issues.apache.org/jira/projects/NIFI
>
>
>
> On Apr 21, 2020, at 11:35 PM, abcfd abcdd <[email protected]> wrote:
>
> Hello,
>
> Our team has worked with NiFi for over one year.
> Our scenario involves processing 3-5 billion data items with NiFi. We found
> that WriteAheadFlowFileRepository and FileSystemRepository could not meet our
> demand, so we put the data to be consumed in tmpfs and chose
> VolatileFlowFileRepository and VolatileContentRepository to reduce I/O cost
> and avoid the WAL, because in our scenario the data can be thrown away when
> backpressure occurs or NiFi is restarted.
>
> However, we found three problems with VolatileFlowFileRepository and
> VolatileContentRepository.
>
> 1. VolatileContentRepository
> With maxSize = 100MB and blockSize = 2KB, there should be 51200 "slots". If
> we write 1 KB at a time, 102400 writes of 1 KB should fit, but the 51201st
> write fails with "java.io.IOException: Content Repository is out of
> space". Here is the JUnit test I wrote.
>
> @Test
> public void test() throws IOException {
>     System.setProperty(NiFiProperties.PROPERTIES_FILE_PATH,
>             TestVolatileContentRepository.class.getResource("/conf/nifi.properties").getFile());
>     final Map<String, String> addProps = new HashMap<>();
>     addProps.put(VolatileContentRepository.BLOCK_SIZE_PROPERTY, "2 KB");
>     final NiFiProperties nifiProps = NiFiProperties.createBasicNiFiProperties(null, addProps);
>     final VolatileContentRepository contentRepo = new VolatileContentRepository(nifiProps);
>     contentRepo.initialize(claimManager);
>     // 100 MB / 1 KB = 102400 writes should fit, but the blocks are exhausted at write 51201
>     for (int idx = 0; idx < 51201; ++idx) {
>         final ContentClaim claim = contentRepo.create(true);
>         try (final OutputStream out = contentRepo.write(claim)) {
>             final byte[] oneK = new byte[1024];
>             Arrays.fill(oneK, (byte) 55);
>
>             out.write(oneK);
>         }
>     }
> }
>
> 2. VolatileFlowFileRepository
> When backpressure occurs, FileSystemSwapManager swaps FlowFiles out to disk
> whenever the swapQueue size exceeds 10000. The swap-out process works, but
> during swap-in VolatileFlowFileRepository does not "acknowledge" the
> FlowFiles that were swapped out: when FileSystemSwapManager swaps FlowFiles
> in from disk, it logs the warning "Cannot swap in FlowFiles from
> location..." because the implementation of "isValidSwapLocationSuffix" in
> VolatileFlowFileRepository always returns false.
> The queue still appears full in the NiFi UI and the upstream processor is
> stuck, perhaps because FileSystemSwapManager "thinks" these FlowFiles have
> not been consumed yet.
>
> 3. We found that NiFi cannot run for more than a week, even when we use
> WriteAheadFlowFileRepository and FileSystemRepository. NiFi got stuck, did
> not process any data, and there was no output in nifi-app.log. After we
> restarted NiFi it was back to normal, but we do not know what happened.
>
> Many thanks
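
For reference, the arithmetic behind problem 1 can be written out as a small standalone sketch. This is not NiFi code: the class and method names are hypothetical, and the model assumes that every claim occupies at least one whole block and that blocks are never shared between claims. That assumption is only a reading of the reported symptom (failure on write 51201), not something confirmed from the repository source.

```java
// Hypothetical model, not NiFi code. Assumption: each claim takes at least one
// whole block, so capacity is counted in blocks rather than in bytes.
public class BlockSlotArithmetic {

    /** Number of fixed-size blocks ("slots") that fit in the repository. */
    static long slots(long maxSizeBytes, long blockSizeBytes) {
        return maxSizeBytes / blockSizeBytes;
    }

    /** Blocks consumed by one claim when blocks are never shared between claims. */
    static long blocksPerClaim(long claimBytes, long blockSizeBytes) {
        // Ceiling division, with a minimum of one block per claim.
        return Math.max(1L, (claimBytes + blockSizeBytes - 1) / blockSizeBytes);
    }

    /** Claims that fit if capacity is counted in whole blocks. */
    static long claimsByBlocks(long maxSizeBytes, long blockSizeBytes, long claimBytes) {
        return slots(maxSizeBytes, blockSizeBytes) / blocksPerClaim(claimBytes, blockSizeBytes);
    }

    /** Claims that would fit if capacity were counted in bytes. */
    static long claimsByBytes(long maxSizeBytes, long claimBytes) {
        return maxSizeBytes / claimBytes;
    }

    public static void main(String[] args) {
        long maxSize = 100L * 1024 * 1024; // 100 MB
        long blockSize = 2L * 1024;        // 2 KB
        long claim = 1024L;                // 1 KB per write

        System.out.println("slots            = " + slots(maxSize, blockSize));
        System.out.println("claims by blocks = " + claimsByBlocks(maxSize, blockSize, claim));
        System.out.println("claims by bytes  = " + claimsByBytes(maxSize, claim));
    }
}
```

Under this model, 51200 one-KB claims fit and the 51201st fails, whereas counting by bytes would allow 102400. The gap between the two numbers is the half-empty 2 KB block behind each 1 KB claim.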
