It seems the Funnel wasn't actually the problem after all.

Here's my new timeline:

18:14... - Stop Processors
18:15:40 - Shutdown NiFi (graceful and successful)
18:28:03 - Start NiFi, which seemingly deletes content
18:31++ - Add Funnel etc. and start Processors again (this is only when I
first see the problem, even though it probably would have happened without
the Funnel as well)

I've uploaded the relevant part of the log here:
<http://pastebin.com/6XWP5SVF>
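
For anyone following along, here is a minimal sketch of how the relevant
lines could be pulled out of a NiFi app log for a given time window. The
timestamp layout and the two marker strings ("Found unknown file" and
"CONTENTMISSING") are taken from the log lines quoted further down in this
thread. The function name and the idea of filtering by a window around the
restart are assumptions made for illustration, not part of NiFi.

```python
import re
from datetime import datetime

# Timestamp layout as seen in the log lines quoted in this thread,
# e.g. "2016-02-16 18:28:35,953 INFO [main] ..."
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} ")

# Messages that showed up around the restart in this thread.
MARKERS = ("Found unknown file", "CONTENTMISSING")


def suspicious_lines(lines, start, end):
    """Yield lines inside [start, end] that contain one of MARKERS.

    Lines without a leading timestamp (for example stack traces) inherit
    the window state of the last timestamped line before them.
    """
    in_window = False
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            in_window = start <= ts <= end
        if in_window and any(marker in line for marker in MARKERS):
            yield line.rstrip("\n")
```

To use it, open the app log (its location depends on the installation) and
pass the file object together with two datetime values, for instance from
just before the 18:28:03 start to shortly after the first failures.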

All processors involved are custom processors, but they don't do anything
special, have been running for days, and have already survived multiple
restarts. I can't share the code right now, but if it becomes important I
can strip them down to a bare minimum and share that.

So the setup at the time of the failure was even simpler:
CustomSourceProcessor was connected to CustomDestinationProcessor via a
normal connection.

Thanks yet again for helping out, everyone!

On Wed, Feb 17, 2016 at 5:04 AM, Aldrin Piri <[email protected]> wrote:

> Lars,
>
> Are you able to share your flow or a template of it so we can try to
> recreate?
>
> If not, could you give some information as to what it is doing and what
> processors/components are involved.  Are there any custom components?
>
> Thanks!
>
> On Tue, Feb 16, 2016 at 10:18 PM, Joe Witt <[email protected]> wrote:
>
> > 'that deletes the original file'
> >
> > True, but even then that refers to the original source data and not
> > what is in the content repository itself.  The content repository
> > error that was emitted about missing flow file exception/content not
> > found is for the purpose of signaling data was removed by some process
> > outside of NiFi.
> >
> > Mark Payne: Any ideas?
> >
> > On Tue, Feb 16, 2016 at 10:15 PM, Thad Guidry <[email protected]>
> > wrote:
> > > There's a checkbox option in the FetchFile that deletes the original file.
> > >
> > > https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/FetchFile.java#L62
> > >
> > > static final AllowableValue COMPLETION_DELETE = new AllowableValue("Delete File", "Delete File", "Deletes the original file from the file system");
> > >
> > >
> > > Perhaps it's something along those lines, maybe in his other processors?
> > > He mentioned "I also added another processor feeding that same funnel" ...
> > > which processor was that exactly?
> > >
> > >
> > > Thad
> > > +ThadGuidry <https://www.google.com/+ThadGuidry>
> > >
> > > On Tue, Feb 16, 2016 at 4:35 PM, Lars Francke <[email protected]>
> > > wrote:
> > >
> > >> Thanks for the explanation.
> > >>
> > >> I tried to reproduce but I can't. I also looked through my bash history
> > >> and I can't find anything suspicious. I'm pretty sure that nothing other
> > >> than NiFi itself deleted files in the content_repository. Everything else
> > >> (logs etc.) is untouched and some content files have survived as well.
> > >> A few FlowFiles are being processed successfully and I just checked the
> > >> creation date of all files in content_repository. Most of them are "old".
> > >>
> > >> On Tue, Feb 16, 2016 at 11:12 PM, Joe Witt <[email protected]>
> wrote:
> > >>
> > >> > Lars,
> > >> >
> > >> > The information you're providing from the logs is a pretty important
> > >> > bit of debug data.
> > >> >
> > >> > This concept of 'CONTENTMISSING' being recorded into the Flow File
> > >> > Repository is NiFi's way of saying "Hey I knew about this flow file
> > >> > but when I tried to access the content it was no longer in the content
> > >> > repository".  What I'm suggesting is something outside of NiFi itself
> > >> > removed the content.  By default, even when you remove content using
> > >> > the NiFi API it isn't actually deleting the content until it has to
> > >> > and it is asynchronous.  Even if you had restarted NiFi during this I
> > >> > don't see how this could occur.
> > >> >
> > >> > Even if you have some bugs in the custom processor implementations
> the
> > >> > issue you're showing here should not be possible.
> > >> >
> > >> > The only explanation that makes sense to me so far is that the
> content
> > >> > was actually deleted from within the content repository by something
> > >> > other than NiFi.
> > >> >
> > >> > Can you reproduce the issue?
> > >> >
> > >> > Thanks
> > >> > Joe
> > >> >
> > >> > On Tue, Feb 16, 2016 at 4:58 PM, Lars Francke <
> [email protected]
> > >
> > >> > wrote:
> > >> > > Any ideas on how to debug this further?
> > >> > >
> > >> > > I know very little about the internals of NiFi but there are
> > obviously
> > >> > > still references to that content and it shouldn't have been
> deleted.
> > >> Can
> > >> > > you think of a way I could have done this by accident?
> > >> > >
> > >> > > On Tue, Feb 16, 2016 at 10:35 PM, Joe Witt <[email protected]>
> > wrote:
> > >> > >
> > >> > >> Interesting.  What that suggests is the content has been removed
> > from
> > >> > >> the content repo itself.
> > >> > >>
> > >> > >> Thanks
> > >> > >> Joe
> > >> > >>
> > >> > >> On Tue, Feb 16, 2016 at 4:15 PM, Lars Francke <
> > [email protected]
> > >> >
> > >> > >> wrote:
> > >> > >> > I attached a debugger and checked a few of those FlowFiles that
> > >> failed
> > >> > >> and
> > >> > >> > searched the logs for those. This is what I found:
> > >> > >> >
> > >> > >> > 2016-02-16 18:28:35,953 INFO [main] o.a.n.c.repository.FileSystemRepository Found unknown file /Users/lars/Downloads/nifi-0.5.0/content_repository/103/1455636839847-103 (1058303 bytes) in File System Repository; archiving file
> > >> > >> >
> > >> > >> > 2016-02-16 18:42:54,840 WARN [Timer-Driven Process Thread-9] o.a.n.c.r.WriteAheadFlowFileRepository Repository Record StandardRepositoryRecord[UpdateType=CONTENTMISSING,Record=StandardFlowFileRecord[uuid=af69ca83-fc03-41f0-91e1-e3d65da54840,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1455636632024-102, container=default, section=102], offset=661978, length=10],offset=0,name=69321836993544,size=10]] is marked to be aborted; it will be persisted in the FlowFileRepository as a DELETE record
> > >> > >> >
> > >> > >> > Now I can't remember having done this but it's entirely
> possible
> > >> that
> > >> > I
> > >> > >> > restarted NiFi prior to my experiment described above.
> > >> > >> >
> > >> > >> >
> > >> > >> > On Tue, Feb 16, 2016 at 9:16 PM, Joe Witt <[email protected]>
> > >> wrote:
> > >> > >> >
> > >> > >> >> Lars,
> > >> > >> >>
> > >> > >> >> Definitely look forward to understanding the mechanics here a
> > bit
> > >> > >> >> better of what you're seeing and if you can provide something
> > >> > >> >> reproducible.  Even if you have a custom processor the
> > API/Process
> > >> > >> >> Session construct should protect from many of the things that
> > can
> > >> go
> > >> > >> >> wrong there.  Now the content repo will likely be largely empty as
> > >> > >> >> the data represents only 888KB of data and it is probably in a
> > >> > >> >> relatively small number of files on disk.
> > >> > >> >>
> > >> > >> >> Thanks
> > >> > >> >> joe
> > >> > >> >>
> > >> > >> >> On Tue, Feb 16, 2016 at 2:57 PM, Lars Francke <
> > >> > [email protected]>
> > >> > >> >> wrote:
> > >> > >> >> > Hi Matt,
> > >> > >> >> >
> > >> > >> >> > thanks for the quick response. It's late here so I'll try
> > >> > reproducing
> > >> > >> >> > tomorrow.
> > >> > >> >> >
> > >> > >> >> > Source and destination processors are custom processors.
> > >> > >> >> > This is Nifi 0.5.0 RC3
> > >> > >> >> >
> > >> > >> >> > NiFi thinks all FlowFiles are still there: <
> > >> > http://imgur.com/isDlRk4>
> > >> > >> >> >
> > >> > >> >> > I'm looking at the logs now; no ERRORs or WARNs that seem
> > >> > >> >> > suspicious so far.
> > >> > >> >> >
> > >> > >> >> > On Tue, Feb 16, 2016 at 8:46 PM, Matthew Clarke <
> > >> > >> >> [email protected]>
> > >> > >> >> > wrote:
> > >> > >> >> >
> > >> > >> >> >> Lars,
> > >> > >> >> >>       What version of NiFi are you running?
> > >> > >> >> >>       What type of processor was your source processor?
> > >> > >> >> >>       What type of processor was the destination processor?
> > >> > >> >> >>       I tried reproducing using a GenerateFlowFile to
> produce
> > >> > ~100k
> > >> > >> >> >> Flowfiles on a connection to an UpdateAttribute processor.
> I
> > >> then
> > >> > >> >> stopped
> > >> > >> >> >> the GenerateFlowFile processor , added a funnel, and moved
> > the
> > >> > >> >> connection.
> > >> > >> >> >> I also added another processor feeding that same funnel and
> > >> routed
> > >> > >> the
> > >> > >> >> >> connection from the funnel back to the UpdateAttribute processor.
> > >> > >> >> >> The files moved as expected through the funnel.
> > >> > >> >> >>
> > >> > >> >> >>       Can you reproduce?   Any other errors in your app log
> > from
> > >> > >> prior
> > >> > >> >> to
> > >> > >> >> >> completing the connection?
> > >> > >> >> >>
> > >> > >> >> >> Thanks,
> > >> > >> >> >> Matt
> > >> > >> >> >>
> > >> > >> >> >> On Tue, Feb 16, 2016 at 1:15 PM, Lars Francke <
> > >> > >> [email protected]>
> > >> > >> >> >> wrote:
> > >> > >> >> >>
> > >> > >> >> >> > Hi,
> > >> > >> >> >> >
> > >> > >> >> >> > I'm trying to understand what happened and how I can
> > prevent
> > >> > this
> > >> > >> in
> > >> > >> >> the
> > >> > >> >> >> > future.
> > >> > >> >> >> >
> > >> > >> >> >> > The outcome seems to be that all my FlowFiles which were
> > >> sitting
> > >> > >> in a
> > >> > >> >> >> > connection have been deleted from disk.
> > >> > >> >> >> >
> > >> > >> >> >> > I had a flow with two processors connected via a single
> > >> > connection.
> > >> > >> >> >> >
> > >> > >> >> >> > What I did:
> > >> > >> >> >> > * Stop all Processors
> > >> > >> >> >> > * Add a Funnel
> > >> > >> >> >> > * Add a Processor
> > >> > >> >> >> > * Move destination end of existing connection to funnel
> > (with
> > >> > the
> > >> > >> >> >> existing
> > >> > >> >> >> > FlowFiles)
> > >> > >> >> >> > * Connect new Processor to Funnel
> > >> > >> >> >> > * Connect Funnel to old destination Processor
> > >> > >> >> >> >
> > >> > >> >> >> > The connection between the Funnel and the Destination
> > >> processor
> > >> > >> still
> > >> > >> >> >> shows
> > >> > >> >> >> > all 90k FlowFiles but the Processor fails on session.read
> > >> with a
> > >> > >> >> >> > MissingFlowFileException.
> > >> > >> >> >> >
> > >> > >> >> >> > Sure enough my content_repository is mostly empty too.
> > >> > >> >> >> >
> > >> > >> >> >> > Now this isn't so bad because it's only a dev environment
> > but
> > >> > I'd
> > >> > >> >> like to
> > >> > >> >> >> > understand how this could happen. Did I do something
> wrong?
> > >> > >> >> >> >
> > >> > >> >> >> > Any hints on what to search for in the logs or which
> place
> > in
> > >> > the
> > >> > >> >> source
> > >> > >> >> >> > code to look?
> > >> > >> >> >> >
> > >> > >> >> >> > Cheers,
> > >> > >> >> >> > Lars
> > >> > >> >> >> >
> > >> > >> >> >>
> > >> > >> >>
> > >> > >>
> > >> >
> > >>
> >
>
