Thanks for the explanation.

I tried to reproduce the problem but couldn't. I also looked through my
bash history and found nothing suspicious. I'm fairly sure that nothing
other than NiFi itself deleted files in the content_repository. Everything
else (logs etc.) is untouched, and some content files have survived as well.
A few FlowFiles are still being processed successfully. I also just checked
the creation dates of all files in content_repository: most of them are "old".
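
In case it helps anyone debugging something similar, here is a rough sketch
of how the CONTENTMISSING records could be cross-checked against the files
on disk. It is not NiFi code. The function names (missing_claims,
check_claims) are made up for illustration, the regular expression is based
only on the WARN line quoted further down in this thread, and the directory
layout <content_repository>/<section>/<claim id> is an assumption inferred
from the "Found unknown file" path in the same log excerpt.

```python
import re
from pathlib import Path

# Matches the resource claim inside a CONTENTMISSING repository record.
# The format is taken from the single WARN line quoted in this thread;
# other NiFi versions may log it differently.
CLAIM_RE = re.compile(
    r"UpdateType=CONTENTMISSING.*?"
    r"StandardResourceClaim\[id=([\w-]+),\s*container=\w+,\s*section=(\w+)\]",
    re.DOTALL,
)


def missing_claims(log_text):
    """Return unique (claim_id, section) pairs, in order of first appearance."""
    # Many FlowFiles can share one resource claim, so drop duplicates.
    return list(dict.fromkeys(CLAIM_RE.findall(log_text)))


def check_claims(repo_root, log_text):
    """Return (claim_id, expected_path, exists_on_disk) for each missing claim.

    ASSUMPTION: claim files live at <repo_root>/<section>/<claim_id>.
    """
    rows = []
    for claim_id, section in missing_claims(log_text):
        path = Path(repo_root) / section / claim_id
        rows.append((claim_id, path, path.is_file()))
    return rows
```

Something like check_claims("content_repository", log_text), with log_text
read from the application log, would then show which of the claims that NiFi
reported as missing are really gone from disk and which are still present.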

On Tue, Feb 16, 2016 at 11:12 PM, Joe Witt <[email protected]> wrote:

> Lars,
>
> The information you're providing from the logs is a pretty important
> bit of debug data.
>
> This concept of 'CONTENTMISSING' being recorded into the Flow File
> Repository is NiFi's way of saying "Hey I knew about this flow file
> but when I tried to access the content it was no longer in the content
> repository".  What I'm suggesting is something outside of NiFi itself
> removed the content.  By default, even when you remove content using
> the NiFi API it isn't actually deleting the content until it has to
> and it is asynchronous.  Even if you had restarted NiFi during this I
> don't see how this could occur.
>
> Even if you have some bugs in the custom processor implementations the
> issue you're showing here should not be possible.
>
> The only explanation that makes sense to me so far is that the content
> was actually deleted from within the content repository by something
> other than NiFi.
>
> Can you reproduce the issue?
>
> Thanks
> Joe
>
> On Tue, Feb 16, 2016 at 4:58 PM, Lars Francke <[email protected]>
> wrote:
> > Any ideas on how to debug this further?
> >
> > I know very little about the internals of NiFi but there are obviously
> > still references to that content and it shouldn't have been deleted. Can
> > you think of a way I could have done this by accident?
> >
> > On Tue, Feb 16, 2016 at 10:35 PM, Joe Witt <[email protected]> wrote:
> >
> >> Interesting.  What that suggests is the content has been removed from
> >> the content repo itself.
> >>
> >> Thanks
> >> Joe
> >>
> >> On Tue, Feb 16, 2016 at 4:15 PM, Lars Francke <[email protected]>
> >> wrote:
> >> > I attached a debugger and checked a few of those FlowFiles that failed and
> >> > searched the logs for those. This is what I found:
> >> >
> >> > 2016-02-16 18:28:35,953 INFO [main]
> >> > o.a.n.c.repository.FileSystemRepository Found unknown file
> >> > /Users/lars/Downloads/nifi-0.5.0/content_repository/103/1455636839847-103
> >> > (1058303 bytes) in File System Repository; archiving file
> >> >
> >> > 2016-02-16 18:42:54,840 WARN [Timer-Driven Process Thread-9]
> >> > o.a.n.c.r.WriteAheadFlowFileRepository Repository Record
> >> > StandardRepositoryRecord[UpdateType=CONTENTMISSING,Record=StandardFlowFileRecord[uuid=af69ca83-fc03-41f0-91e1-e3d65da54840,claim=StandardContentClaim
> >> > [resourceClaim=StandardResourceClaim[id=1455636632024-102,
> >> > container=default, section=102], offset=661978,
> >> > length=10],offset=0,name=69321836993544,size=10]] is marked to be aborted;
> >> > it will be persisted in the FlowFileRepository as a DELETE record
> >> > Now I can't remember having done this but it's entirely possible that I
> >> > restarted NiFi prior to my experiment described above.
> >> >
> >> >
> >> > On Tue, Feb 16, 2016 at 9:16 PM, Joe Witt <[email protected]> wrote:
> >> >
> >> >> Lars,
> >> >>
> >> >> Definitely look forward to understanding the mechanics here a bit
> >> >> better of what you're seeing and if you can provide something
> >> >> reproducible.  Even if you have a custom processor the API/Process
> >> >> Session construct should protect from many of the things that can go
> >> >> wrong there.  Now the content repo will likely be largely empty as the
> >> >> data represents only 888KB of data and it is probably in a relatively
> >> >> small number of files on disk.
> >> >>
> >> >> Thanks
> >> >> joe
> >> >>
> >> >> On Tue, Feb 16, 2016 at 2:57 PM, Lars Francke <[email protected]>
> >> >> wrote:
> >> >> > Hi Matt,
> >> >> >
> >> >> > thanks for the quick response. It's late here so I'll try reproducing
> >> >> > tomorrow.
> >> >> >
> >> >> > Source and destination processors are custom processors.
> >> >> > This is NiFi 0.5.0 RC3
> >> >> >
> >> >> > NiFi thinks all FlowFiles are still there: <http://imgur.com/isDlRk4>
> >> >> >
> >> >> > I'm looking at the logs now; no ERRORs or WARNs that seem suspicious so far
> >> >> >
> >> >> > On Tue, Feb 16, 2016 at 8:46 PM, Matthew Clarke <[email protected]>
> >> >> > wrote:
> >> >> >
> >> >> >> Lars,
> >> >> >>       What version of NiFi are you running?
> >> >> >>       What type of processor was your source processor?
> >> >> >>       What type of processor was the destination processor?
> >> >> >>       I tried reproducing using a GenerateFlowFile to produce ~100k
> >> >> >> FlowFiles on a connection to an UpdateAttribute processor. I then stopped
> >> >> >> the GenerateFlowFile processor, added a funnel, and moved the connection.
> >> >> >> I also added another processor feeding that same funnel and routed the
> >> >> >> connection from the funnel back to the UpdateAttribute processor. The
> >> >> >> files moved as expected through the funnel.
> >> >> >>
> >> >> >>       Can you reproduce?   Any other errors in your app log from prior to
> >> >> >> completing the connection?
> >> >> >>
> >> >> >> Thanks,
> >> >> >> Matt
> >> >> >>
> >> >> >> On Tue, Feb 16, 2016 at 1:15 PM, Lars Francke <[email protected]>
> >> >> >> wrote:
> >> >> >>
> >> >> >> > Hi,
> >> >> >> >
> >> >> >> > I'm trying to understand what happened and how I can prevent this in the
> >> >> >> > future.
> >> >> >> >
> >> >> >> > The outcome seems to be that all my FlowFiles which were sitting in a
> >> >> >> > connection have been deleted from disk.
> >> >> >> >
> >> >> >> > I had a flow with two processors connected via a single connection.
> >> >> >> >
> >> >> >> > What I did:
> >> >> >> > * Stop all Processors
> >> >> >> > * Add a Funnel
> >> >> >> > * Add a Processor
> >> >> >> > * Move destination end of existing connection to funnel (with the existing
> >> >> >> > FlowFiles)
> >> >> >> > * Connect new Processor to Funnel
> >> >> >> > * Connect Funnel to old destination Processor
> >> >> >> >
> >> >> >> > The connection between the Funnel and the Destination processor still shows
> >> >> >> > all 90k FlowFiles but the Processor fails on session.read with a
> >> >> >> > MissingFlowFileException.
> >> >> >> >
> >> >> >> > Sure enough my content_repository is mostly empty too.
> >> >> >> >
> >> >> >> > Now this isn't so bad because it's only a dev environment but I'd like to
> >> >> >> > understand how this could happen. Did I do something wrong?
> >> >> >> >
> >> >> >> > Any hints on what to search for in the logs or which place in the source
> >> >> >> > code to look?
> >> >> >> >
> >> >> >> > Cheers,
> >> >> >> > Lars
> >> >> >> >
> >> >> >>
> >> >>
> >>
>
