It seems as if the Funnel thing wasn't actually the problem. Here's my new timeline:
18:14... - Stop Processors
18:15:40 - Shutdown NiFi (graceful and successful)
18:28:03 - Starting NiFi which seemingly deletes content
18:31++ - Add Funnel etc. and start Processors again (so only now do I see the problem occurring even though it probably would have happened without it as well)

I've uploaded the relevant part of the log here <http://pastebin.com/6XWP5SVF>

All processors involved are custom processors but they don't do anything special and have been running for days and survived multiple restarts already. I can't share code now but if it becomes important I can strip them to a bare minimum and share.

So when the failure happened it was even easier: CustomSourceProcessor was connected to CustomDestinationProcessor via a normal connection.

Thanks yet again for helping out everyone!

On Wed, Feb 17, 2016 at 5:04 AM, Aldrin Piri <[email protected]> wrote:

> Lars,
>
> Are you able to share your flow or a template of it so we can try to
> recreate?
>
> If not, could you give some information as to what it is doing and what
> processors/components are involved. Are there any custom components?
>
> Thanks!
>
> On Tue, Feb 16, 2016 at 10:18 PM, Joe Witt <[email protected]> wrote:
>
> > 'that deletes the original file'
> >
> > True but even then that refers to the original source data and not
> > what it is in the content repository itself. The content repository
> > error that was emitted about missing flow file exception/content not
> > found is for the purpose of signaling data was removed by some process
> > outside of NiFi.
> >
> > Mark Payne: Any ideas?
> >
> > On Tue, Feb 16, 2016 at 10:15 PM, Thad Guidry <[email protected]> wrote:
> > > There's a checkbox option in the FetchFile that deletes the original file.
> > >
> > > https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/FetchFile.java#L62
> > >
> > > static final AllowableValue COMPLETION_DELETE = new AllowableValue("Delete File", "Delete File", "Deletes the original file from the file system");
> > >
> > > Perhaps it's something along those lines, maybe in his other processors?
> > > He mentioned "I also added another processor feeding that same funnel" ...
> > > which processor was that exactly?
> > >
> > > Thad
> > > +ThadGuidry <https://www.google.com/+ThadGuidry>
> > >
> > > On Tue, Feb 16, 2016 at 4:35 PM, Lars Francke <[email protected]> wrote:
> > >
> > >> Thanks for the explanation.
> > >>
> > >> I tried to reproduce but I can't. I also looked through my bash history and
> > >> I can't find anything suspicious. I'm pretty sure that nothing deleted
> > >> files in the content_repository that's not NiFi itself. Everything else
> > >> (logs etc.) are all untouched and some content files have survived as well.
> > >> A few FlowFiles are being processed successfully and I just checked the
> > >> creation date of all files in content_repository. Most of them are "old".
> > >>
> > >> On Tue, Feb 16, 2016 at 11:12 PM, Joe Witt <[email protected]> wrote:
> > >>
> > >> > Lars,
> > >> >
> > >> > The information you're providing from the logs is a pretty important
> > >> > bit of debug data.
> > >> >
> > >> > This concept of 'CONTENTMISSING' being recorded into the Flow File
> > >> > Repository is NiFi's way of saying "Hey I knew about this flow file
> > >> > but when I tried to access the content it was no longer in the content
> > >> > repository". What I'm suggesting is something outside of NiFi itself
> > >> > removed the content.
> > >> > By default, even when you remove content using
> > >> > the NiFi API it isn't actually deleting the content until it has to
> > >> > and it is asynchronous. Even if you had restarted NiFi during this I
> > >> > don't see how this could occur.
> > >> >
> > >> > Even if you have some bugs in the custom processor implementations the
> > >> > issue you're showing here should not be possible.
> > >> >
> > >> > The only explanation that makes sense to me so far is that the content
> > >> > was actually deleted from within the content repository by something
> > >> > other than NiFi.
> > >> >
> > >> > Can you reproduce the issue?
> > >> >
> > >> > Thanks
> > >> > Joe
> > >> >
> > >> > On Tue, Feb 16, 2016 at 4:58 PM, Lars Francke <[email protected]> wrote:
> > >> > > Any ideas on how to debug this further?
> > >> > >
> > >> > > I know very little about the internals of NiFi but there are obviously
> > >> > > still references to that content and it shouldn't have been deleted. Can
> > >> > > you think of a way I could have done this by accident?
> > >> > >
> > >> > > On Tue, Feb 16, 2016 at 10:35 PM, Joe Witt <[email protected]> wrote:
> > >> > >
> > >> > >> Interesting. What that suggests is the content has been removed from
> > >> > >> the content repo itself.
> > >> > >>
> > >> > >> Thanks
> > >> > >> Joe
> > >> > >>
> > >> > >> On Tue, Feb 16, 2016 at 4:15 PM, Lars Francke <[email protected]> wrote:
> > >> > >> > I attached a debugger and checked a few of those FlowFiles that failed and
> > >> > >> > searched the logs for those.
> > >> > >> > This is what I found:
> > >> > >> >
> > >> > >> > 2016-02-16 18:28:35,953 INFO [main] o.a.n.c.repository.FileSystemRepository Found unknown file /Users/lars/Downloads/nifi-0.5.0/content_repository/103/1455636839847-103 (1058303 bytes) in File System Repository; archiving file
> > >> > >> >
> > >> > >> > 2016-02-16 18:42:54,840 WARN [Timer-Driven Process Thread-9] o.a.n.c.r.WriteAheadFlowFileRepository Repository Record StandardRepositoryRecord[UpdateType=CONTENTMISSING,Record=StandardFlowFileRecord[uuid=af69ca83-fc03-41f0-91e1-e3d65da54840,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1455636632024-102, container=default, section=102], offset=661978, length=10],offset=0,name=69321836993544,size=10]] is marked to be aborted; it will be persisted in the FlowFileRepository as a DELETE record
> > >> > >> >
> > >> > >> > Now I can't remember having done this but it's entirely possible that I
> > >> > >> > restarted NiFi prior to my experiment described above.
> > >> > >> >
> > >> > >> > On Tue, Feb 16, 2016 at 9:16 PM, Joe Witt <[email protected]> wrote:
> > >> > >> >
> > >> > >> >> Lars,
> > >> > >> >>
> > >> > >> >> Definitely look forward to understanding the mechanics here a bit
> > >> > >> >> better of what you're seeing and if you can provide something
> > >> > >> >> reproducible. Even if you have a custom processor the API/Process
> > >> > >> >> Session construct should protect from many of the things that can go
> > >> > >> >> wrong there. Now the content repo will likely be largely empty as the
> > >> > >> >> data represents only 888KB of data and it is probably in a relatively
> > >> > >> >> small number of files on disk.
> > >> > >> >>
> > >> > >> >> Thanks
> > >> > >> >> joe
> > >> > >> >>
> > >> > >> >> On Tue, Feb 16, 2016 at 2:57 PM, Lars Francke <[email protected]> wrote:
> > >> > >> >> > Hi Matt,
> > >> > >> >> >
> > >> > >> >> > thanks for the quick response. It's late here so I'll try reproducing
> > >> > >> >> > tomorrow.
> > >> > >> >> >
> > >> > >> >> > Source and destination processors are custom processors.
> > >> > >> >> > This is NiFi 0.5.0 RC3
> > >> > >> >> >
> > >> > >> >> > NiFi thinks all FlowFiles are still there: <http://imgur.com/isDlRk4>
> > >> > >> >> >
> > >> > >> >> > I'm looking at the logs now; no ERRORs or WARNs that seem suspicious so far
> > >> > >> >> >
> > >> > >> >> > On Tue, Feb 16, 2016 at 8:46 PM, Matthew Clarke <[email protected]> wrote:
> > >> > >> >> >
> > >> > >> >> >> Lars,
> > >> > >> >> >>      What version of NiFi are you running?
> > >> > >> >> >>      What type of processor was your source processor?
> > >> > >> >> >>      What type of processor was the destination processor?
> > >> > >> >> >>      I tried reproducing using a GenerateFlowFile to produce ~100k
> > >> > >> >> >> FlowFiles on a connection to an UpdateAttribute processor. I then stopped
> > >> > >> >> >> the GenerateFlowFile processor, added a funnel, and moved the connection.
> > >> > >> >> >> I also added another processor feeding that same funnel and routed the
> > >> > >> >> >> connection from the funnel back to the UpdateAttribute processor. The
> > >> > >> >> >> files moved as expected through the funnel.
> > >> > >> >> >>
> > >> > >> >> >>      Can you reproduce? Any other errors in your app log from prior to
> > >> > >> >> >> completing the connection?
> > >> > >> >> >>
> > >> > >> >> >> Thanks,
> > >> > >> >> >> Matt
> > >> > >> >> >>
> > >> > >> >> >> On Tue, Feb 16, 2016 at 1:15 PM, Lars Francke <[email protected]> wrote:
> > >> > >> >> >>
> > >> > >> >> >> > Hi,
> > >> > >> >> >> >
> > >> > >> >> >> > I'm trying to understand what happened and how I can prevent this in the
> > >> > >> >> >> > future.
> > >> > >> >> >> >
> > >> > >> >> >> > The outcome seems to be that all my FlowFiles which were sitting in a
> > >> > >> >> >> > connection have been deleted from disk.
> > >> > >> >> >> >
> > >> > >> >> >> > I had a flow with two processors connected via a single connection.
> > >> > >> >> >> >
> > >> > >> >> >> > What I did:
> > >> > >> >> >> > * Stop all Processors
> > >> > >> >> >> > * Add a Funnel
> > >> > >> >> >> > * Add a Processor
> > >> > >> >> >> > * Move destination end of existing connection to funnel (with the existing
> > >> > >> >> >> > FlowFiles)
> > >> > >> >> >> > * Connect new Processor to Funnel
> > >> > >> >> >> > * Connect Funnel to old destination Processor
> > >> > >> >> >> >
> > >> > >> >> >> > The connection between the Funnel and the Destination processor still shows
> > >> > >> >> >> > all 90k FlowFiles but the Processor fails on session.read with a
> > >> > >> >> >> > MissingFlowFileException.
> > >> > >> >> >> >
> > >> > >> >> >> > Sure enough my content_repository is mostly empty too.
> > >> > >> >> >> >
> > >> > >> >> >> > Now this isn't so bad because it's only a dev environment but I'd like to
> > >> > >> >> >> > understand how this could happen. Did I do something wrong?
> > >> > >> >> >> >
> > >> > >> >> >> > Any hints on what to search for in the logs or which place in the source
> > >> > >> >> >> > code to look?
> > >> > >> >> >> >
> > >> > >> >> >> > Cheers,
> > >> > >> >> >> > Lars
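
For anyone wanting to cross-check the two log messages quoted in this thread, here is a rough sketch in Python. It only assumes the message wording shown above for NiFi 0.5.0 ("Found unknown file ..." and "UpdateType=CONTENTMISSING ... StandardResourceClaim[id=...]"); other versions may log differently. It also assumes that the file name inside content_repository equals the resource claim id, which matches the examples above but is an inference on my part. The function names are made up for illustration.

```python
import re
from collections import Counter

# Both patterns are derived from the two log lines quoted in this thread.
# They are assumptions about the log format, not a documented NiFi contract.
UNKNOWN_FILE = re.compile(r"Found unknown file (\S+) \((\d+) bytes\)")
MISSING_CLAIM = re.compile(
    r"UpdateType=CONTENTMISSING.*?StandardResourceClaim\[id=([^,\]]+)"
)


def summarize(log_text):
    """Return (set of archived file names, Counter of CONTENTMISSING per claim id)."""
    archived = set()
    for match in UNKNOWN_FILE.finditer(log_text):
        # Keep only the last path component, e.g. "1455636839847-103".
        archived.add(match.group(1).rsplit("/", 1)[-1])
    missing = Counter(m.group(1) for m in MISSING_CLAIM.finditer(log_text))
    return archived, missing


def archived_then_missing(log_text):
    """Claim ids reported as unknown/archived that also show up as CONTENTMISSING."""
    archived, missing = summarize(log_text)
    return {claim: count for claim, count in missing.items() if claim in archived}


# Example use (the log path is hypothetical):
# with open("logs/nifi-app.log", encoding="utf-8", errors="replace") as fh:
#     print(archived_then_missing(fh.read()))
```

A non-empty result would suggest that content files archived as "unknown" at startup are the same ones later reported missing, which is the pattern my timeline hints at. An empty result does not rule anything out, since the assumptions above may not hold.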
