About stateful transformations

Juan Rodríguez Hortalá Sun, 23 Oct 2016 13:30:06 -0700

Hi all,

I don't have much experience with Flink, so please forget me if I ask some
obvious questions. I was taking a look to the documentation on stateful
transformations in Flink at https://ci.apache.org/projects/flink/flink-docs-
release-1.2/dev/state.html. I'm mostly interested in Flink for stream
processing, and I would like to know:


- What is the biggest state that has been used in production deployments?
I'm interested in how many GB of state, among all key-value pairs, have
been used before in long running streaming jobs deployed in production.
Maybe some company has shared this in some conference or blog post. I guess
for that RocksDB backend is the best option for big states, to avoid being
limited by the available memory.

- Is there any pre built functionality for state eviction? I'm thinking of
LRU cache-like behavior, with eviction based on time or size, similar to
Guava cache (https://github.com/google/guava/wiki/CachesExplained). This is
probably easy to implement anyway, by using the clear() primitive, but I
wouldn't like to reinvent the wheel if this is already implemented
somewhere.

- When using file:// for the checkpointing URL, is the data replicated in
the workers, or a failure in a worker leads to losing the state stored in
that worker? I guess with hdfs:// we get the replication of HDFS, and we
don't have that problem. I also guess S3 can be used for checkpointing
state too, is there any remarkable performance impact of using s3 instead
of HDFS for checkpointing? I guess some performance is lost compared to a
setup running in YARN with collocated DataNodes and NodeManagers, but I
wanted to know if the impact is negible, as checkpointing is performed at a
relatively slow frequency. Also I'm interested on Flink running on EMR,
where the impact of this should be even smaller because the access to S3 is
faster from EMR than from an in-house YARN cluster outside the AWS cloud.

- Is there any problem with the RocksDB backend on top of HDFS related to
defragmentation? How is clear handled for long running jobs? I'm thinking
on a streaming job that has a state with a size of several hundred GBs,
where each key-pair is stored for a week and then deleted. How does clear()
work, and how do you deal with the "small files problem" of HDFS (
http://inquidia.com/news-and-info/working-small-files-hadoop-part-1) for
the FsState and RocksDB backend on top of HDFS? I guess this wouldn' t be a
problem for S3, as it is an object store that has no problem with small
files.

Thanks a lot for your help!

Greetings,

Juan Rodriguez Hortala

About stateful transformations

Reply via email to