Issue with empty files in content repository

Malthe Fri, 13 Sep 2019 02:16:08 -0700

While trying to figure out why a `MergeContent` processor was producing a
linearly growing amount of content that wasn't being reaped (the
retention policies were not upheld and free disk space would fall to
zero), we realized that some flow files in the queue pointed to
content which didn't exist on disk: the corresponding file in the
content repository was zero bytes.
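
For anyone wanting to check their own installation, here is a minimal
sketch of how one might list the zero-byte files under a content
repository directory. It only walks the filesystem and does not use any
NiFi API. The helper name `find_empty_files` is made up for
illustration, and the default path `./content_repository` is an
assumption; adjust it to wherever your content repository actually
lives.

```python
import os
import sys
from typing import Iterator


def find_empty_files(root: str) -> Iterator[str]:
    """Yield paths of regular files under `root` that are zero bytes long."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.isfile(path) and os.path.getsize(path) == 0:
                    yield path
            except OSError:
                # The file may have been removed while we were scanning; skip it.
                continue


if __name__ == "__main__":
    # Assumed default location; pass the real directory as the first argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "./content_repository"
    for empty_path in sorted(find_empty_files(root)):
        print(empty_path)
```

Running it against a live repository may also report files that are
merely in the process of being written, so the output is a starting
point for investigation and not proof of corruption.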


How might this have happened, and if it does happen, shouldn't
processors somehow be able to recover from it?

What seems to happen is that the flow file goes right back into the
queue, where it will of course fail again. Further, a simple grep seems
to show that references to the empty content file's id appear in many
other files in the content repository. This seems to suggest that none
of this content can be reaped because it is still being referenced
somehow and thus isn't eligible for archival and/or deletion.

Thanks for any ideas.
