[ 
https://issues.apache.org/jira/browse/NIFI-6559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910571#comment-16910571
 ] 

Mark Payne commented on NIFI-6559:
----------------------------------

I don't think we can change something like this from within NiFi or from a 
separate utility. It would generate the same effect.

Can you better explain the end goal here? If I remember correctly, this was 
tied to a mailing list thread about OutOfMemoryErrors. Was the desire to 
delete these files in order to sacrifice some of the data but not all? Or to 
avoid these particular updates because they were known to be particularly 
memory-intensive? Or something entirely different?

I could imagine perhaps having a utility that might purge data from a 
particular queue, or perhaps flowfiles that have attributes that exceed 65 KB 
or something like that... but we'd have to be super careful in a situation like 
that also because we'd have to ensure that we kept around a Set of all 
FlowFiles that were removed so that any further updates to those FlowFiles 
would not be included, etc.
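
To make the bookkeeping concern concrete, here is a rough sketch of the "Set of 
removed FlowFiles" idea. It is not NiFi code: the Update class, its fields, and 
the purge method are hypothetical stand-ins for whatever the real journal 
records look like, and the 65 KB threshold is only the example figure from 
above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

class PurgeSketch {

    /** Hypothetical stand-in for one repository update; not a NiFi class. */
    static final class Update {
        final String flowFileId;
        final String queueId;
        final int attributeBytes;

        Update(String flowFileId, String queueId, int attributeBytes) {
            this.flowFileId = flowFileId;
            this.queueId = queueId;
            this.attributeBytes = attributeBytes;
        }
    }

    /**
     * Walks the updates in order. The first update that matches shouldPurge
     * marks its FlowFile as removed; every later update for that FlowFile is
     * dropped as well, whether or not it matches the predicate itself.
     */
    static List<Update> purge(List<Update> updates, Predicate<Update> shouldPurge) {
        Set<String> removedIds = new HashSet<>();
        List<Update> kept = new ArrayList<>();
        for (Update update : updates) {
            if (removedIds.contains(update.flowFileId)) {
                continue; // FlowFile was already purged; skip further updates
            }
            if (shouldPurge.test(update)) {
                removedIds.add(update.flowFileId);
                continue;
            }
            kept.add(update);
        }
        return kept;
    }
}
```

Calling purge(updates, u -> u.attributeBytes > 65 * 1024) would drop FlowFiles 
with oversized attributes, and u -> u.queueId.equals("q1") would purge a single 
queue. Note that this only skips updates that come after the purged one; 
updates already kept for the same FlowFile would still need to be dealt with, 
which is part of why this needs so much care.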

> FlowFile Repo Journal Recovery Should not Fail if External Overflow Files are 
> Missing
> -------------------------------------------------------------------------------------
>
>                 Key: NIFI-6559
>                 URL: https://issues.apache.org/jira/browse/NIFI-6559
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>            Reporter: Peter Wicks
>            Assignee: Peter Wicks
>            Priority: Minor
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When NiFi is journaling the FlowFile repository changes to disk it sometimes 
> writes Overflow files if it exceeds a certain memory threshold.
> These files are tracked inside of the *.journal files as External File 
> References. If one of these external file references is deleted or lost the 
> entire journal fails to recover.
> Instead, I feel this should work more like FlowFiles that lose their queue, 
> or Content in the Content Repository that has lost its FlowFile.  Log it, 
> and move on.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
