GitHub user StefanRRichter opened a pull request:

    https://github.com/apache/flink/pull/4764

    [FLINK-7757] [checkpointing] Introduce resource guard for RocksDBKeye…

    …dStateBackend to reduce locking and avoid blocking behavior.
    
    ## What is the purpose of the change
    
    `RocksDBKeyedStateBackend` uses a lock to guard the db instance against 
disposal of the native resources while some parallel threads might still access 
db, which might otherwise lead to segfaults.
    
    Unfortunately, this locking is a bit too strict and can lead to situations 
where snapshots block the pipeline. This can happen when a snapshot s1 is 
running and blocked somewhere in IO while holding the guarding lock. A second 
snapshot s2 can be triggered in parallel and needs to hold the lock in the 
synchronous part to get a snapshot from db. As s1 is still holding on to the 
lock, s2 can block here and stop the operator from processing further elements.
    A simple solution could remove lock acquisition from the synchronous phase, 
because both the synchronous phase and the disposal of the backend are only 
allowed to be triggered from the thread that also drives element processing.
    
    This PR removes long sections under the lock altogether, to open up the 
possibility of parallel checkpointing. The change introduces a guard for the 
rocksdb instance that blocks disposal for as long as there are still clients 
potentially accessing the instance in parallel. This is realized by keeping a 
synchronized counter for active clients and blocking disposal until the client 
count drops to zero.
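
The counting scheme described above can be illustrated with a minimal sketch. This is not the `ResourceGuard` class added by this PR; the class name `ClientCountingGuard`, the nested `Lease` type, and all method names are hypothetical and chosen only for illustration. It assumes that clients register before touching the resource and release when done, and that disposal waits on a monitor until the count reaches zero.

```java
/**
 * Hypothetical sketch of a client-counting guard. Names and structure are
 * assumptions for illustration, not the code from this pull request.
 */
public class ClientCountingGuard implements AutoCloseable {

    private final Object lock = new Object();
    private int clientCount = 0;
    private boolean closed = false;

    /** Registers a client. Fails once disposal has started. */
    public Lease acquire() {
        synchronized (lock) {
            if (closed) {
                throw new IllegalStateException("Guard is already closed.");
            }
            ++clientCount;
        }
        return new Lease();
    }

    /** Rejects new clients, then blocks until all active clients have released. */
    @Override
    public void close() {
        boolean interrupted = false;
        synchronized (lock) {
            closed = true;
            while (clientCount > 0) {
                try {
                    lock.wait();
                } catch (InterruptedException e) {
                    // Keep waiting; disposing while clients are active is unsafe.
                    interrupted = true;
                }
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }

    /** Number of currently registered clients. */
    public int getClientCount() {
        synchronized (lock) {
            return clientCount;
        }
    }

    /** Handle held by one client; closing it more than once has no extra effect. */
    public final class Lease implements AutoCloseable {
        private boolean released = false;

        @Override
        public void close() {
            synchronized (lock) {
                if (!released) {
                    released = true;
                    --clientCount;
                    lock.notifyAll();
                }
            }
        }
    }
}
```

In such a design, an asynchronous snapshot would hold a lease only while it reads from the db, for example inside a try-with-resources block, so the monitor itself is held just long enough to update the counter and never across blocking IO.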
    
    This approach could also be integrated with triggering timers, which have 
always been problematic in the disposal phase and are currently unregulated. In 
the new model, they could register as yet another client.
    
    
    ## Brief change log
    
      - *Introduce a class `ResourceGuard` as client counter*
      - *Replace use of `asyncSnapshotLock` with the guard*
    
    ## Verifying this change
    
    This change is mostly covered by existing tests.
    
    I added the unit test `ResourceGuardTest` and 
`StateBackendTestBase::testParallelAsyncSnapshots` to ensure that parallel 
snapshots work.
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): (no)
      - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
      - The serializers: (no)
      - The runtime per-record code paths (performance sensitive): (no)
      - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes)
    
    ## Documentation
    
      - Does this pull request introduce a new feature? (no)
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/StefanRRichter/flink rocksdb-guard

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4764.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4764
    
----
commit 13824b726dfdae2954840baf4d114ccc7963f484
Author: Stefan Richter <[email protected]>
Date:   2017-10-03T13:37:58Z

    [FLINK-7757] [checkpointing] Introduce resource guard for 
RocksDBKeyedStateBackend to reduce locking and avoid blocking behavior.

----

