[jira] [Updated] (FLINK-10043) Refactor object construction/inititlization/restore code

Stefan Richter (JIRA) Mon, 28 Jan 2019 02:31:13 -0800


     [ 
https://issues.apache.org/jira/browse/FLINK-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Stefan Richter updated FLINK-10043:
-----------------------------------
    Description: 
Currently, the constructor of {{RocksDBKeyedStateBackend}} has the following 
shortcomings:
- It does creation and cleanup of some directories and files. This makes it 
harder to unit-test because dependencies are created in the constructor and not 
passed in from outside.
- It leaves many important fields uninitialized, and further methods, e.g. 
{{restore}}, _have_ to be called before the backend object is fully constructed. 
This is error-prone in many ways and hard to unit-test. I think the origin of 
this problem was introducing incremental snapshots, because in this case, we 
can only open a RocksDB instance AFTER the restore code was executed and 
restored the working directory.

As a solution, I would suggest having a dedicated builder class that takes the 
current constructor parameters and (optionally) the state handles to restore. 
This class then constructs and initializes all required objects, and those 
objects are simply passed to the new {{RocksDBKeyedStateBackend}} constructor, 
which does no work other than assigning dependencies to fields.
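
To illustrate the idea, here is a minimal sketch of the builder/constructor split. 
All names in it ({{SketchBackend}}, {{SketchBackendBuilder}}, 
{{withRestoreStateHandles}}) are hypothetical and not taken from the Flink code 
base, and state handles are modelled as plain strings to keep it self-contained.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the backend: the constructor only assigns fields.
final class SketchBackend {
    final Path instanceBasePath;
    final List<String> restoredHandles;

    SketchBackend(Path instanceBasePath, List<String> restoredHandles) {
        this.instanceBasePath = instanceBasePath;
        this.restoredHandles = restoredHandles;
    }
}

// Hypothetical builder: takes the former constructor parameters plus the
// optional state handles, and does all preparation before constructing.
final class SketchBackendBuilder {
    private final Path instanceBasePath;
    private Collection<String> restoreStateHandles = Collections.emptyList();

    SketchBackendBuilder(Path instanceBasePath) {
        this.instanceBasePath = instanceBasePath;
    }

    SketchBackendBuilder withRestoreStateHandles(Collection<String> handles) {
        this.restoreStateHandles = handles;
        return this;
    }

    SketchBackend build() {
        // Directory creation, restore and opening the database would go here
        // (assumption: omitted in this sketch). Only fully initialized
        // objects are handed to the constructor.
        List<String> restored =
                Collections.unmodifiableList(new ArrayList<>(restoreStateHandles));
        return new SketchBackend(instanceBasePath, restored);
    }
}
```

With this shape, a unit test can construct the backend directly with 
hand-made collaborators and never has to go through the builder.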

With this change, I would also extract the different restore strategies for 
incremental and full snapshots out of the backend's main class, into their own 
classes. They will then be used in the newly introduced builder from the 
previous step. This builder would receive all objects that currently go into 
the constructor and the restore method. It should create all directories, 
(if applicable) download state, create and restore a RocksDB instance object, 
and create and register states. Everything concerning the construction of 
collaborators for the backend should go into the builder, and the backend main 
class should simply receive all collaborators and assign them to final 
fields.
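
A rough sketch of how the extracted restore strategies could look. The 
interface and class names are hypothetical, and the step order for the full 
restore (open an empty instance, then write the entries) is my assumption; only 
the incremental order (restore the working directory first, then open the 
instance) is stated above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical strategy interface; returns the performed steps so the
// ordering can be inspected in a test.
interface SketchRestoreOperation {
    List<String> restore(List<String> stateHandles);
}

// Full snapshot: assumed to open an empty instance and then write entries.
final class SketchFullRestore implements SketchRestoreOperation {
    @Override
    public List<String> restore(List<String> stateHandles) {
        List<String> steps = new ArrayList<>();
        steps.add("open empty db");
        for (String handle : stateHandles) {
            steps.add("write entries from " + handle);
        }
        return steps;
    }
}

// Incremental snapshot: the working directory must be restored BEFORE
// the database instance can be opened.
final class SketchIncrementalRestore implements SketchRestoreOperation {
    @Override
    public List<String> restore(List<String> stateHandles) {
        List<String> steps = new ArrayList<>();
        for (String handle : stateHandles) {
            steps.add("download files of " + handle);
        }
        steps.add("open db on restored directory");
        return steps;
    }
}
```

The builder would pick one of the two implementations depending on the kind 
of state handles it was given and run it before creating the backend object.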

One detail to consider for the builder is that all resources for collaborators 
should be created and initialized in a resource-acquisition-is-initialization 
(RAII) style, in particular because some of them are backed by native (JNI) 
objects: if we fail to create a resource during the process, all previously 
created resources should be properly released and de-allocated. 
Releasing/de-allocation should happen in the exact inverse order of creation, 
to avoid any transitive double-frees in the native code. Only when all 
resources have been created will the builder create the main backend object, 
so that, again, the constructor does not have to deal with any fault handling 
or cleanup logic.
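
The cleanup rule can be sketched as follows. This is only an illustration 
under assumptions: {{SketchResource}} and {{SketchAcquisition}} are hypothetical 
names, resources are plain {{AutoCloseable}} objects rather than JNI-backed 
ones, and a failure is simulated through the {{failOn}} parameter.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

// Hypothetical resource that records its name when it is closed.
final class SketchResource implements AutoCloseable {
    private final String name;
    private final List<String> closeLog;

    SketchResource(String name, List<String> closeLog) {
        this.name = name;
        this.closeLog = closeLog;
    }

    @Override
    public void close() {
        closeLog.add(name);
    }
}

final class SketchAcquisition {
    // Creates the resources in the given order. If creation of the resource
    // named failOn fails, everything created so far is closed in reverse
    // order of creation and the original exception is rethrown.
    static List<SketchResource> acquireAll(
            List<String> names, String failOn, List<String> closeLog) {
        Deque<SketchResource> acquired = new ArrayDeque<>();
        try {
            for (String name : names) {
                if (name.equals(failOn)) {
                    throw new IllegalStateException("could not create " + name);
                }
                acquired.push(new SketchResource(name, closeLog));
            }
        } catch (RuntimeException e) {
            // pop() returns the most recently created resource first.
            while (!acquired.isEmpty()) {
                try {
                    acquired.pop().close();
                } catch (RuntimeException closeFailure) {
                    e.addSuppressed(closeFailure);
                }
            }
            throw e;
        }
        // Success: return the resources in creation order.
        List<SketchResource> result = new ArrayList<>(acquired);
        Collections.reverse(result);
        return result;
    }
}
```

Only after {{acquireAll}} returns normally would the builder call the backend 
constructor, so the constructor itself never sees a half-initialized state.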

  was:
Currently, the constructor of {{RocksDBKeyedStateBackend}} has the following 
shortcomings:
- It does initialization and cleanup of some directories and files. This makes 
it harder to unit-test because dependencies are created in the constructor and 
not passed in from outside.
- It leaves many important fields uninitialized, and further methods, e.g. 
{{restore}}, _have_ to be called before the object is fully constructed. This is 
error-prone in many ways and hard to unit-test. I think the origin of this 
problem was introducing incremental snapshots, because in this case, we can 
only open a RocksDB instance AFTER the restore code was executed and restored 
the working directory.

As a solution, I would suggest having a dedicated builder class that takes the 
current constructor parameters and (optionally) the state handles to restore. 
This class then constructs and initializes all dependencies, and the 
dependencies are simply passed to the new {{RocksDBKeyedStateBackend}} 
constructor, which does no work other than assigning dependencies to fields.

With this change, I would also extract the different restore strategies for 
incremental and full snapshots out of the main class, into their own classes. 
They will then be used in the newly introduced builder from the previous step.


> Refactor object construction/inititlization/restore code
> --------------------------------------------------------
>
>                 Key: FLINK-10043
>                 URL: https://issues.apache.org/jira/browse/FLINK-10043
>             Project: Flink
>          Issue Type: Sub-task
>            Reporter: Stefan Richter
>            Assignee: Stefan Richter
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
