Mike,

There is a DetectDuplicate processor. It lets you specify an attribute to use for identification (for example, a SHA256 hash, an identifier in the data, a filename, etc.). It uses a DistributedMapCacheClient to track what it has seen, so it could be backed by Redis or whatever other implementations we have available. Would that give you what you need?
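To make the idea concrete, here is a rough sketch of the check-and-record logic described above. It is not NiFi code and does not use the NiFi API: `InMemoryCache`, `get_and_put_if_absent`, `sha256_hex`, and `route` are hypothetical names, the dictionary is only a stand-in for a distributed cache such as Redis, and the routing labels are illustrative.

```python
import hashlib


class InMemoryCache:
    """Hypothetical stand-in for a distributed map cache (e.g. Redis)."""

    def __init__(self):
        self._store = {}

    def get_and_put_if_absent(self, key, value):
        """Return the value already stored for key, or store value and return None."""
        if key in self._store:
            return self._store[key]
        self._store[key] = value
        return None


def sha256_hex(content: bytes) -> str:
    """Identifier derived from the content itself (assumed choice of attribute)."""
    return hashlib.sha256(content).hexdigest()


def route(cache, identifier, description):
    """Label an item 'duplicate' if its identifier was seen before."""
    previous = cache.get_and_put_if_absent(identifier, description)
    return "non-duplicate" if previous is None else "duplicate"


cache = InMemoryCache()
first = route(cache, sha256_hex(b"hello"), "file-a.txt")
second = route(cache, sha256_hex(b"hello"), "file-b.txt")  # same content, new name
third = route(cache, sha256_hex(b"other"), "file-c.txt")
print(first, second, third)  # non-duplicate duplicate non-duplicate
```

Keying on a content hash catches resubmissions under a different filename. Keying on the filename instead would catch renamed-content cases the other way around, so the choice of identifier depends on what "seen before" means for your client.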
Thanks
-Mark

Sent from my iPhone

> On Dec 15, 2018, at 8:52 AM, Mike Thomsen <[email protected]> wrote:
>
> We are getting a lot of independent submissions of data from various and
> sundry teams that work with our client, and our client may need a processor
> that roughly does this story:
>
> "as a NiFi user, I would like to be able to detect whether a file has been
> seen before and processed based on feedback from a RDBMS/HBase/Elastic and
> then be able to choose whether to reprocess it or drop it."
>
> Want to make sure that I'm not reinventing the wheel before writing such a
> processor.
>
> Thanks,
>
> Mike
