We use the ListS3 processor quite a bit, and maintaining state for millions of 
objects in S3 is critical for us. If we lost that state, we would have to 
re-download a lot of S3 objects, which is costly.

We use the "local-provider" and the default directory "./state/local".

When I had to migrate to a new instance, we could not afford to lose state, so 
I copied the entire original "./state/local" directory from the old instance 
to the new one. The ListS3 processor on the new instance was able to use the 
state from the old one successfully. I didn't see any documentation on this, 
but I was able to get it to work.
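
For anyone who wants to script the copy described above, here is a minimal 
sketch. It is not an official NiFi procedure. It assumes both instances are 
stopped while the copy runs and that both directories are reachable from one 
machine. The function name and the example paths are made up for illustration.

```python
import shutil
from pathlib import Path


def copy_local_state(old_state_dir: str, new_state_dir: str) -> int:
    """Copy a local state directory tree; return the number of files copied.

    Hypothetical helper: assumes NiFi is stopped on both instances so the
    files do not change during the copy.
    """
    src = Path(old_state_dir)
    dst = Path(new_state_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"state directory not found: {src}")
    # Refuse to merge into a directory that already holds state files.
    if dst.exists() and any(dst.iterdir()):
        raise FileExistsError(f"refusing to overwrite non-empty directory: {dst}")
    # copytree creates missing parent directories; dirs_exist_ok needs Python 3.8+.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sum(1 for p in dst.rglob("*") if p.is_file())
```

Example call, with placeholder paths: 
copy_local_state("/mnt/old-nifi/state/local", "/opt/nifi/state/local")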

I have not figured out how to manipulate the state intentionally. There are use 
cases where we need to go back a few days and re-list recent objects, so being 
able to reset the state to a particular date would be helpful. That would let 
us "re-list" objects based on date parameters. As a workaround, I've added 
date filters.
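
To illustrate the idea behind the date-filter workaround, here is a small 
sketch written outside of NiFi. It is not how ListS3 works internally and not 
the exact filter in our flow. The S3Object type and the relist_window function 
are hypothetical names. It simply keeps objects whose last-modified time falls 
within the last N days.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, NamedTuple, Optional


class S3Object(NamedTuple):
    """Hypothetical stand-in for one entry in an S3 listing."""
    key: str
    last_modified: datetime


def relist_window(objects: Iterable[S3Object], days_back: int,
                  now: Optional[datetime] = None) -> List[S3Object]:
    """Return objects modified within the last `days_back` days (inclusive)."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days_back)
    return [o for o in objects if cutoff <= o.last_modified <= now]
```

Passing an explicit `now` makes it possible to re-list relative to any past 
date rather than the current time.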

--

Dave Hirko | [email protected] | 571.421.7729

On Sun, 2017-01-22 at 08:21 -0700, Toivo Adams wrote:

Hi,

As far as I know, ListS3 uses NiFi's built-in StateManager, which in turn uses
StateProviders.
NiFi may have different StateProvider implementations.
Currently NiFi has two providers: one ZooKeeper based and one write-ahead log
file based.
ZooKeeper is used when a NiFi cluster is configured, and the other is used for
a local, single-node NiFi.
As I understand it, NiFi will automatically choose ZooKeeper for a cluster and
the local provider for a single NiFi instance.

You can replay a FlowFile.
Open Data Provenance, choose a Provenance Event, open the CONTENT tab and click
REPLAY.
Also, many NiFi processors have a Failure relationship, which is used to route
failed FlowFiles to some other path. So you can automate how failed FlowFiles
are handled.

Data Provenance is the simplest way to see successfully processed files.
But you can also create a custom Reporting Task to collect Provenance Events
and do whatever you need with them.

Regards
Toivo



--
View this message in context: 
http://apache-nifi-developer-list.39713.n7.nabble.com/In-ListS3-processor-where-does-Nifi-persists-the-state-of-objects-tp14489p14490.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.
