At a high level, we try not to copy anything unless we have to, so when you say “under NiFi care” it becomes a bit unclear. For example, one may be copying a file using a zero-copy algorithm. Let’s assume that NiFi was the facilitator of that process. Even so, the data is/was never under NiFi management, because nothing was read into memory to perform the copy. Now, even if something is read into memory, what does that really mean from your perspective? Technically, one may argue that the ‘record’ is now under NiFi management and could be acknowledged. But what if the processing of this record fails somewhere downstream?
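To make the two possible acknowledgement points concrete, here is a minimal sketch in plain Java. It is not NiFi API and not the lumberjack wire format: `DurableStore`, `Producer`, `ackOnReceive`, `ackAfterCommit` and the numeric `batchId` are all hypothetical names, and the store simply stands in for whatever "safely handed over" ends up meaning.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch (not NiFi API): two points at which a listener could acknowledge a batch. */
final class AckTimingSketch {

    /** Stand-in for durable storage; can be told to fail so the failure path is visible. */
    static final class DurableStore {
        private final List<String> committed = new ArrayList<>();
        private final boolean failOnCommit;

        DurableStore(boolean failOnCommit) { this.failOnCommit = failOnCommit; }

        void commit(List<String> batch) {
            if (failOnCommit) {
                throw new IllegalStateException("commit failed");
            }
            committed.addAll(batch);
        }

        List<String> committed() { return committed; }
    }

    /** Stand-in for the producing agent; records which batches it saw acknowledged. */
    static final class Producer {
        final List<Integer> acked = new ArrayList<>();

        void ack(int batchId) { acked.add(batchId); }
    }

    /** Acknowledge while the batch is only in memory, then try to commit it. */
    static boolean ackOnReceive(List<String> batch, int batchId, DurableStore store, Producer producer) {
        producer.ack(batchId); // the producer may now advance its offset
        try {
            store.commit(batch);
            return true;
        } catch (IllegalStateException e) {
            return false; // data is lost, but it was already acknowledged
        }
    }

    /** Commit first; acknowledge only if the commit succeeded. */
    static boolean ackAfterCommit(List<String> batch, int batchId, DurableStore store, Producer producer) {
        try {
            store.commit(batch);
        } catch (IllegalStateException e) {
            return false; // no acknowledgement, so the producer keeps its offset and can resend
        }
        producer.ack(batchId);
        return true;
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("line 1", "line 2");

        Producer early = new Producer();
        ackOnReceive(batch, 1, new DurableStore(true), early);
        System.out.println("ack on receive, commit failed: acked=" + early.acked);

        Producer late = new Producer();
        ackAfterCommit(batch, 1, new DurableStore(true), late);
        System.out.println("ack after commit, commit failed: acked=" + late.acked);
    }
}
```

With a failing store, the first variant leaves the producer holding an acknowledgement for data that was never stored (`acked=[1]`), while the second leaves it with none (`acked=[]`). Which of the two a processor needs depends on what the acknowledgement is supposed to promise.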
Basically, IMHO your question is about transactional capabilities, where a transaction implies that the acknowledgement is provided *only* when a record is fully processed and will never need to be re-processed, with the exception of catastrophic failures. If so, given the asynchronous nature of NiFi, it may not be a straightforward process, albeit doable. But before we get to that, let us know if my ramblings above are not totally off ;).

Cheers
Oleg

> On Dec 8, 2015, at 3:07 AM, Andre <andre-li...@fucs.org> wrote:
>
> All,
>
> Still working on the lumberjack processor. Data is currently being decoded,
> SSL is sort of working but before I start wrapping up I wanted to confirm:
>
> Lumberjack is a protocol that includes the dispatch of an acknowledgement
> message to the producing agent.
>
> As consequence, usually a producer tailing a file will only update its
> offset AFTER receiving the acknowledgement from the lumberjack endpoint.
>
> Ideally this acknowledgement should only be sent after data is no longer in
> the processor memory buffers and the chances of memory loss are restricted
> to catastrophic failure.
>
> Which leads to my question: From a development point of view, at what stage
> data is assumed to be under NiFi's care?
>
> I thank you in advance.