Joseph Percivall created NIFI-3273:
--------------------------------------
Summary: MinimalLockingWriteAheadLog doesn't properly handle
corrupted journals
Key: NIFI-3273
URL: https://issues.apache.org/jira/browse/NIFI-3273
Project: Apache NiFi
Issue Type: Bug
Reporter: Joseph Percivall
Priority: Critical
If the system dies abruptly while NiFi is running (e.g. a sudden power loss)
before writes are flushed, anything that was being written to disk can become
corrupted. A ticket for the provenance repository has already been created[1].
The content repo handles this automatically since the content claim won't be
valid if it hasn't been written out yet. The database repo is just a cache and
is rebuilt anyway. The logs are handled by logback. The flow.xml.gz can be
rolled back (manually) to the last archive.
This ticket is for the MinimalLockingWriteAheadLog which backs the FlowFile
repo and local state. Originally brought up here[2] for MiNiFi, it will also
affect NiFi.
One possible solution is to restore transactions up to the corrupted
transaction ID and then ignore the rest. This could cause state to become out
of sync with the processed FlowFiles (if the FlowFile repo is restored but
local state cannot be fully restored), but given the rarity of the event I
think it is an appropriate risk to accept.
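
A minimal sketch of that recovery idea, for illustration only. It does not use
the actual MinimalLockingWriteAheadLog journal format: the record layout
(transaction ID, payload length, payload, CRC32 checksum) and the names
JournalRecoverySketch, Record, writeRecord and recover are all assumptions made
up for this example. It reads records in order and stops at the first one that
is truncated, fails its checksum, or has a non-increasing ID, keeping only the
transactions read before that point.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical journal format, NOT the real NiFi one:
// [long transactionId][int payloadLength][payload bytes][long crc32(payload)]
final class JournalRecoverySketch {

    /** One recovered transaction (illustrative type). */
    static final class Record {
        final long transactionId;
        final byte[] payload;

        Record(long transactionId, byte[] payload) {
            this.transactionId = transactionId;
            this.payload = payload;
        }
    }

    /** Appends one record in the hypothetical layout described above. */
    static void writeRecord(DataOutputStream out, long id, byte[] payload) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(payload);
        out.writeLong(id);
        out.writeInt(payload.length);
        out.write(payload);
        out.writeLong(crc.getValue());
    }

    /** Restores records up to the first corrupted one and ignores the rest. */
    static List<Record> recover(byte[] journal) throws IOException {
        List<Record> recovered = new ArrayList<>();
        long lastId = -1L;
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(journal))) {
            while (in.available() > 0) {
                long id = in.readLong();
                int length = in.readInt();
                // An implausible header is treated as corruption.
                if (id <= lastId || length < 0 || length > in.available()) {
                    break;
                }
                byte[] payload = new byte[length];
                in.readFully(payload);
                long storedCrc = in.readLong();
                CRC32 crc = new CRC32();
                crc.update(payload);
                if (crc.getValue() != storedCrc) {
                    break; // checksum mismatch: stop and drop the remainder
                }
                recovered.add(new Record(id, payload));
                lastId = id;
            }
        } catch (EOFException truncated) {
            // The journal ended mid-record; keep what was recovered so far.
        }
        return recovered;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        for (long id = 1; id <= 3; id++) {
            writeRecord(out, id, ("update-" + id).getBytes(StandardCharsets.UTF_8));
        }
        byte[] journal = buffer.toByteArray();
        journal[journal.length - 9] ^= 0x01; // flip a bit in the last payload byte
        System.out.println("recovered transactions: " + recover(journal).size());
    }
}
```

In this sketch a journal whose last record is damaged still yields every
transaction before it, which is the trade-off described above: later updates
are lost, but the repository can start instead of failing on the corrupt tail.
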
The workaround for the FlowFile repo is to set
"nifi.flowfile.repository.always.sync" to true, but currently there is no way
to set "always sync" for the local state provider.
[1] https://issues.apache.org/jira/browse/NIFI-2890
[2] https://community.hortonworks.com/questions/75280/why-does-my-minifi-flow-fail-to-run-when-turning-o.html