Joe,

If you wanted to go the route of truncating it, I would recommend starting with the nifi-toolkit-flowfile-repo module and updating that. It already has all the dependencies in place to read the repository and update it. You would want to just read each transaction from a partition and write it to a new file until you hit the EOFException, and then just discard that transaction.

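As a rough illustration of the copy-until-EOFException idea above, here is a minimal Java sketch. It does not read the real write-ahead log format: it assumes a hypothetical layout in which each record is an 8-byte transaction ID, a 4-byte payload length, and then the payload. The class name PartitionTruncator, the method copyCompleteRecords, and the MAX_RECORD_BYTES limit are all made up for this example. A real tool would need the repository's own serialization code in place of the three read calls.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper; NOT part of NiFi and NOT the real partition file format.
class PartitionTruncator {

    // Assumed sanity limit so a corrupt length field cannot trigger a huge allocation.
    static final int MAX_RECORD_BYTES = 64 * 1024 * 1024;

    /**
     * Copies every complete record from rawIn to rawOut and stops at the
     * first record that cannot be read in full. Returns the number of
     * records copied. The partial trailing record is never written.
     */
    static int copyCompleteRecords(InputStream rawIn, OutputStream rawOut) throws IOException {
        DataInputStream in = new DataInputStream(new BufferedInputStream(rawIn));
        DataOutputStream out = new DataOutputStream(new BufferedOutputStream(rawOut));
        int copied = 0;
        while (true) {
            try {
                // Read the whole record into memory before writing anything,
                // so a truncated record leaves no partial bytes in the output.
                long transactionId = in.readLong();
                int length = in.readInt();
                if (length < 0 || length > MAX_RECORD_BYTES) {
                    throw new IOException("Implausible record length: " + length);
                }
                byte[] payload = new byte[length];
                in.readFully(payload);

                out.writeLong(transactionId);
                out.writeInt(length);
                out.write(payload);
                copied++;
            } catch (EOFException e) {
                // Either a clean end of file or a record cut off mid-write;
                // in both cases there is nothing more to salvage.
                break;
            }
        }
        out.flush();
        return copied;
    }
}
```

If something like this were tried, it would be safest to run it against a copy of the partition file and to write the output to a new file, as described above, rather than modifying the original in place.
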
The other option, not assuming that EOFException implies out of data, would mean updating MinimalLockingWriteAheadLog (in the nifi-commons/nifi-write-ahead-log module) and then, around lines 472-479, updating the logic so that if an Exception is caught there, we call nextPartition.getNextRecoverableTransactionId() again if the partition does actually have more data (this may require adding some sort of isRecoveryDataAvailable() method or something like that on the Partition class).

Does this help?

Thanks
-Mark

On Sep 6, 2017, at 1:01 PM, Joe Gresock wrote:

Sorry, 144 was a typo.. there are 14 files. Yes, it appears to have run out of disk space, so that's probably the root cause. Can you give me any ideas on how to carry out your two ideas? How would I look for the end of a record, so as to truncate it?

On Wed, Sep 6, 2017 at 4:55 PM, Mark Payne wrote:

Hmmm ok interesting... once it hits an EOFException it is assuming that there is no more data in the partition. Clearly, there is because it then fails when calling endRecovery(). Did you perhaps run out of disk space on your FlowFile Repo while it was running or hit an OutOfMemoryError? Perhaps that would cause an EOFException and then continue writing.

The fact that there are 144 files in that directory is also very odd... there are generally only 1-2 files in that directory. Do all of your partitions have that many files? Any errors before the restart about not being able to checkpoint the FlowFile Repo?

At this point, I'm not entirely sure what can be done, other than to perhaps try to manually truncate that last record in the Partition that is causing the EOFException. Or perhaps the MinimalLockingWriteAheadLog could be updated to not assume that EOFException implies that the partition no longer has data in it. Unfortunately, though, I'm not seeing any easy workaround.

On Sep 6, 2017, at 12:37 PM, Joe Gresock wrote:

Yes, I do see:

ERROR [main] org.wali.MinimalLockingWriteAheadLog org.wali.MinimalLockingWriteAheadLog@1e620fe7 unexpectedly reached End-of-File when reading from Partition-214 for Transaction ID 1918212626; assuming crash and ignoring this transaction.

In that directory, I see 144 files, totalling ~120MB. The first two files are multi-megabyte files, and the other 12 are all either 7K or 4K.

On Wed, Sep 6, 2017 at 4:30 PM, Mark Payne wrote:

Joe,

Any other errors in the logs? Specifically, looking for errors that contain the text:

unexpectedly reached End-of-File when reading from

or:

unexpectedly found End-of-File when reading from

This is not something that I've ever run into personally, but looking through the code, trying to understand what may cause this. Also, if you look at the files in /data/nifi/flowfile_repository/partition-8, how many files are there in there, and how large are they?

Thanks
-Mark

On Sep 6, 2017, at 12:22 PM, Joe Gresock wrote:

1.1.0, it's not on a system I can copy/paste from, but here's part of the stack trace:

at org.wali.MinimalLockingWriteAheadLog$Partition.endRecovery(MinimalLockingWriteAheadLog.java:1047) ~[nifi-write-ahead-log-1.1.0.jar:1.1.0]
at org.wali.MinimalLockingWriteAheadLog.recoverFromEdits(MinimalLockingWriteAheadLog.java:487) ~[nifi-write-ahead-log-1.1.0.jar:1.1.0]
at org.wali.MinimalLockingWriteAheadLog.recoverRecords(MinimalLockingWriteAheadLog.java:301) ~[nifi-write-ahead-log-1.1.0.jar:1.1.0]

On Wed, Sep 6, 2017 at 4:13 PM, Mark Payne wrote:

Joe,

What version of NiFi are you running? Do you have a stack trace?

Thanks
-Mark

On Sep 6, 2017, at 11:59 AM, Joe Gresock wrote:

I'm wondering if there is a way to recover from this scenario:

ERROR [main] o.a.nifi.controller.StandardFlowService Failed to load flow from cluster due to: org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster due to: java.lang.IllegalStateException: Signaled end to recovery, but there are more recovery files for Partition in directory /data/nifi/flowfile_repository/partition-8

I have nearly a TB of files in my content_repository, so I'd really like to be able to salvage this node, but I'm not sure how to proceed, as the node won't start up.

--
I know what it is to be in need, and I know what it is to have plenty. I have learned the secret of being content in any and every situation, whether well fed or hungry, whether living in plenty or in want. I can do all this through him who gives me strength. *-Philippians 4:12-13*
