'that deletes the original file'

True, but even then that option refers to the original source file on
disk and not to the copy held in the content repository itself.  The
content repository error that was emitted (the missing flow file
exception / content not found) signals that data was removed by some
process outside of NiFi.
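
In case it helps with digging through the logs, here is a rough sketch
(not anything shipped with NiFi) that pulls the FlowFile uuid and resource
claim out of CONTENTMISSING records.  The regular expression is only an
assumption based on the WARN line Lars quoted further down, with the record
on a single line as it would be in the log file; the function name
find_missing_content is made up for illustration, and other NiFi versions
may format the record differently.

```python
import re

# Assumed layout, taken from the WARN line quoted later in this thread:
#   ...UpdateType=CONTENTMISSING,...uuid=<uuid>,...
#   StandardResourceClaim[id=<claim>, container=<name>, section=<n>]...
RECORD = re.compile(
    r"UpdateType=CONTENTMISSING.*?uuid=(?P<uuid>[0-9a-f-]{36})"
    r".*?StandardResourceClaim\[id=(?P<claim>[^,\]]+),\s*"
    r"container=(?P<container>[^,\]]+),\s*section=(?P<section>[^,\]]+)\]"
)


def find_missing_content(lines):
    """Return (uuid, claim id, container, section) per CONTENTMISSING record."""
    hits = []
    for line in lines:
        match = RECORD.search(line)
        if match:
            hits.append(
                (match["uuid"], match["claim"], match["container"], match["section"])
            )
    return hits
```

Opening the app log and passing the file object to find_missing_content()
gives one tuple per affected FlowFile; the claim id and section can then be
compared against the files that are still present under content_repository.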

Mark Payne: Any ideas?

On Tue, Feb 16, 2016 at 10:15 PM, Thad Guidry <[email protected]> wrote:
> There's a checkbox option in the FetchFile that deletes the original file.
>
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/FetchFile.java#L62
>
> static final AllowableValue COMPLETION_DELETE = new AllowableValue("Delete
> File", "Delete File", "Deletes the original file from the file system");
>
>
> Perhaps it's something along those lines, maybe in his other processors?
> He mentioned "I also added another processor feeding that same funnel" ...
> which processor was that exactly?
>
>
> Thad
> +ThadGuidry <https://www.google.com/+ThadGuidry>
>
> On Tue, Feb 16, 2016 at 4:35 PM, Lars Francke <[email protected]>
> wrote:
>
>> Thanks for the explanation.
>>
>> I tried to reproduce but I can't. I also looked through my bash history and
>> I can't find anything suspicious. I'm pretty sure that nothing other than
>> NiFi itself deleted files in the content_repository. Everything else
>> (logs etc.) is untouched and some content files have survived as well.
>> A few FlowFiles are being processed successfully and I just checked the
>> creation date of all files in content_repository. Most of them are "old".
>>
>> On Tue, Feb 16, 2016 at 11:12 PM, Joe Witt <[email protected]> wrote:
>>
>> > Lars,
>> >
>> > The information you're providing from the logs is a pretty important
>> > bit of debug data.
>> >
>> > This concept of 'CONTENTMISSING' being recorded into the Flow File
>> > Repository is NiFi's way of saying "Hey I knew about this flow file
>> > but when I tried to access the content it was no longer in the content
>> > repository".  What I'm suggesting is something outside of NiFi itself
>> > removed the content.  By default, even when you remove content using
>> > the NiFi API it isn't actually deleting the content until it has to
>> > and it is asynchronous.  Even if you had restarted NiFi during this I
>> > don't see how this could occur.
>> >
>> > Even if you have some bugs in the custom processor implementations the
>> > issue you're showing here should not be possible.
>> >
>> > The only explanation that makes sense to me so far is that the content
>> > was actually deleted from within the content repository by something
>> > other than NiFi.
>> >
>> > Can you reproduce the issue?
>> >
>> > Thanks
>> > Joe
>> >
>> > On Tue, Feb 16, 2016 at 4:58 PM, Lars Francke <[email protected]>
>> > wrote:
>> > > Any ideas on how to debug this further?
>> > >
>> > > I know very little about the internals of NiFi but there are obviously
>> > > still references to that content and it shouldn't have been deleted. Can
>> > > you think of a way I could have done this by accident?
>> > >
>> > > On Tue, Feb 16, 2016 at 10:35 PM, Joe Witt <[email protected]> wrote:
>> > >
>> > >> Interesting.  What that suggests is the content has been removed from
>> > >> the content repo itself.
>> > >>
>> > >> Thanks
>> > >> Joe
>> > >>
>> > >> On Tue, Feb 16, 2016 at 4:15 PM, Lars Francke <[email protected]>
>> > >> wrote:
>> > >> > I attached a debugger and checked a few of those FlowFiles that failed and
>> > >> > searched the logs for those. This is what I found:
>> > >> >
>> > >> > 2016-02-16 18:28:35,953 INFO [main] o.a.n.c.repository.FileSystemRepository
>> > >> > Found unknown file
>> > >> > /Users/lars/Downloads/nifi-0.5.0/content_repository/103/1455636839847-103
>> > >> > (1058303 bytes) in File System Repository; archiving file
>> > >> > 2016-02-16 18:42:54,840 WARN [Timer-Driven Process Thread-9]
>> > >> > o.a.n.c.r.WriteAheadFlowFileRepository Repository Record
>> > >> > StandardRepositoryRecord[UpdateType=CONTENTMISSING,Record=StandardFlowFileRecord[uuid=af69ca83-fc03-41f0-91e1-e3d65da54840,claim=StandardContentClaim
>> > >> > [resourceClaim=StandardResourceClaim[id=1455636632024-102,
>> > >> > container=default, section=102], offset=661978,
>> > >> > length=10],offset=0,name=69321836993544,size=10]] is marked to be aborted;
>> > >> > it will be persisted in the FlowFileRepository as a DELETE record
>> > >> >
>> > >> > Now I can't remember having done this but it's entirely possible that I
>> > >> > restarted NiFi prior to my experiment described above.
>> > >> >
>> > >> >
>> > >> > On Tue, Feb 16, 2016 at 9:16 PM, Joe Witt <[email protected]> wrote:
>> > >> >
>> > >> >> Lars,
>> > >> >>
>> > >> >> Definitely look forward to understanding the mechanics of what you're
>> > >> >> seeing a bit better, and whether you can provide something
>> > >> >> reproducible.  Even if you have a custom processor, the API/Process
>> > >> >> Session construct should protect from many of the things that can go
>> > >> >> wrong there.  Now the content repo will likely be largely empty, as the
>> > >> >> data represents only 888KB and it is probably in a relatively
>> > >> >> small number of files on disk.
>> > >> >>
>> > >> >> Thanks
>> > >> >> joe
>> > >> >>
>> > >> >> On Tue, Feb 16, 2016 at 2:57 PM, Lars Francke <[email protected]>
>> > >> >> wrote:
>> > >> >> > Hi Matt,
>> > >> >> >
>> > >> >> > thanks for the quick response. It's late here so I'll try reproducing
>> > >> >> > tomorrow.
>> > >> >> >
>> > >> >> > Source and destination processors are custom processors.
>> > >> >> > This is NiFi 0.5.0 RC3.
>> > >> >> >
>> > >> >> > NiFi thinks all FlowFiles are still there: <http://imgur.com/isDlRk4>
>> > >> >> >
>> > >> >> > I'm looking at the logs now; no ERRORs or WARNs that seem suspicious so far.
>> > >> >> >
>> > >> >> > On Tue, Feb 16, 2016 at 8:46 PM, Matthew Clarke <[email protected]>
>> > >> >> > wrote:
>> > >> >> >
>> > >> >> >> Lars,
>> > >> >> >>       What version of NiFi are you running?
>> > >> >> >>       What type of processor was your source processor?
>> > >> >> >>       What type of processor was the destination processor?
>> > >> >> >>       I tried reproducing using a GenerateFlowFile to produce ~100k
>> > >> >> >> FlowFiles on a connection to an UpdateAttribute processor. I then stopped
>> > >> >> >> the GenerateFlowFile processor, added a funnel, and moved the connection.
>> > >> >> >> I also added another processor feeding that same funnel and routed the
>> > >> >> >> connection from the funnel back to the UpdateAttribute processor. The
>> > >> >> >> files moved as expected through the funnel.
>> > >> >> >>
>> > >> >> >>       Can you reproduce?  Any other errors in your app log from prior to
>> > >> >> >> completing the connection?
>> > >> >> >>
>> > >> >> >> Thanks,
>> > >> >> >> Matt
>> > >> >> >>
>> > >> >> >> On Tue, Feb 16, 2016 at 1:15 PM, Lars Francke <[email protected]>
>> > >> >> >> wrote:
>> > >> >> >>
>> > >> >> >> > Hi,
>> > >> >> >> >
>> > >> >> >> > I'm trying to understand what happened and how I can prevent this in the
>> > >> >> >> > future.
>> > >> >> >> >
>> > >> >> >> > The outcome seems to be that all my FlowFiles which were sitting in a
>> > >> >> >> > connection have been deleted from disk.
>> > >> >> >> >
>> > >> >> >> > I had a flow with two processors connected via a single connection.
>> > >> >> >> >
>> > >> >> >> > What I did:
>> > >> >> >> > * Stop all Processors
>> > >> >> >> > * Add a Funnel
>> > >> >> >> > * Add a Processor
>> > >> >> >> > * Move destination end of existing connection to funnel (with the
>> > >> >> >> >   existing FlowFiles)
>> > >> >> >> > * Connect new Processor to Funnel
>> > >> >> >> > * Connect Funnel to old destination Processor
>> > >> >> >> >
>> > >> >> >> > The connection between the Funnel and the Destination processor still shows
>> > >> >> >> > all 90k FlowFiles but the Processor fails on session.read with a
>> > >> >> >> > MissingFlowFileException.
>> > >> >> >> >
>> > >> >> >> > Sure enough my content_repository is mostly empty too.
>> > >> >> >> >
>> > >> >> >> > Now this isn't so bad because it's only a dev environment but I'd like to
>> > >> >> >> > understand how this could happen. Did I do something wrong?
>> > >> >> >> >
>> > >> >> >> > Any hints on what to search for in the logs or which place in the source
>> > >> >> >> > code to look?
>> > >> >> >> >
>> > >> >> >> > Cheers,
>> > >> >> >> > Lars
>> > >> >> >> >
>> > >> >> >>
>> > >> >>
>> > >>
>> >
>>
