[
https://issues.apache.org/jira/browse/FLINK-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stefan Richter updated FLINK-10043:
-----------------------------------
Description:
Currently, the constructor of {{RocksDBKeyedStateBackend}} has the following
shortcomings:
- It does creation and cleanup of some directories and files. This makes it
harder to unit-test because dependencies are created in the constructor and not
passed in from outside.
- It leaves many important fields uninitialized and more methods e.g.
{{restore}} _have_ to be called before the backend object is fully constructed.
This is error-prone in many ways and hard to unit-test. I think the origin of
this problem was introducing incremental snapshots, because in this case, we
can only open a RocksDB instance AFTER the restore code was executed and
restored the working directory.
As a solution, I would suggest to have a dedicated builder class that takes the
current constructor parameters and (optional) the state handles to restore.
Then, this class constructs and intializes all required objects, and those
objects are only passed to the new {{RocksDBKeyedStateBackend}} constructor
that does no other work besided assigning dependencies to fields.
With this change, I would also extract the different restore strategies for
incremental and full snapshots out of the backend's main class, into their own
classes. They will then be used in the newly introduced builder from the
previous step. This builder would receive all objects that currently go into
the constructor and the restore method. It should create all directories, and
(if applicable) download state, create and restore a RocksDB instance object,
create and register states. Everything concerning the construction of
collaboratores for the backend should go into the builder and the backend main
class should can simply receive all collaboratores and assign them to final
fields.
One detail to concider for the builder is that all resources for collaboratores
should be created and initialized in a resource-acquisition-is-initialization
(RAII) style, in particular because some of them are backed by native (JNI)
objects: If we fail to create a resource during the process, all previously
created resources should properly be released and de-allocated.
Releasing/De-allocation should happen in the excact inverse order of creation,
to avoid any transitive double-frees in the native code. Only when all
resources are created, the builder will create the main backend object, so
again that the constuctor does not have to deal with any fault handling or
cleanup-logic.
was:
Currently, the constructor of {{RocksDBKeyedStateBackend}} has the following
shortcomings:
- It does initialization and cleanup of some directories and files. this makes
it harder to unit-test because dependencies are created in the constructor and
not passed in from outside.
- It leaves many important fields uninitialized and more methods e.g.
{{restore}} _have_ to be called before the object is fully constructed. This is
error-prone in many ways and hard to unit-test. I think the origin of this
problem was introducing incremental snapshots, because in this case, we can
only open a RocksDB instance AFTER the restore code was executed and restored
the working directory.
As a solution, I would suggest to have a dedicated builder class that takes the
current constructor parameters and (optional) the state handles to restore.
Then, this class constructs and intializes all dependencies, and dependencies
are only passed to the new {{RocksDBKeyedStateBackend}} constructor that does
no other work besided assigning dependencies to fields.
With this change, I would also extract the different restore strategies for
incremental and full snapshots out of the main class, into their own classes.
They will then be used in the newly introduced builder from the previous step.
> Refactor object construction/inititlization/restore code
> --------------------------------------------------------
>
> Key: FLINK-10043
> URL: https://issues.apache.org/jira/browse/FLINK-10043
> Project: Flink
> Issue Type: Sub-task
> Reporter: Stefan Richter
> Assignee: Stefan Richter
> Priority: Major
>
> Currently, the constructor of {{RocksDBKeyedStateBackend}} has the following
> shortcomings:
> - It does creation and cleanup of some directories and files. This makes it
> harder to unit-test because dependencies are created in the constructor and
> not passed in from outside.
> - It leaves many important fields uninitialized and more methods e.g.
> {{restore}} _have_ to be called before the backend object is fully
> constructed. This is error-prone in many ways and hard to unit-test. I think
> the origin of this problem was introducing incremental snapshots, because in
> this case, we can only open a RocksDB instance AFTER the restore code was
> executed and restored the working directory.
> As a solution, I would suggest to have a dedicated builder class that takes
> the current constructor parameters and (optional) the state handles to
> restore. Then, this class constructs and intializes all required objects, and
> those objects are only passed to the new {{RocksDBKeyedStateBackend}}
> constructor that does no other work besided assigning dependencies to fields.
> With this change, I would also extract the different restore strategies for
> incremental and full snapshots out of the backend's main class, into their
> own classes. They will then be used in the newly introduced builder from the
> previous step. This builder would receive all objects that currently go into
> the constructor and the restore method. It should create all directories, and
> (if applicable) download state, create and restore a RocksDB instance object,
> create and register states. Everything concerning the construction of
> collaboratores for the backend should go into the builder and the backend
> main class should can simply receive all collaboratores and assign them to
> final fields.
> One detail to concider for the builder is that all resources for
> collaboratores should be created and initialized in a
> resource-acquisition-is-initialization (RAII) style, in particular because
> some of them are backed by native (JNI) objects: If we fail to create a
> resource during the process, all previously created resources should properly
> be released and de-allocated. Releasing/De-allocation should happen in the
> excact inverse order of creation, to avoid any transitive double-frees in the
> native code. Only when all resources are created, the builder will create the
> main backend object, so again that the constuctor does not have to deal with
> any fault handling or cleanup-logic.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)