[ 
https://issues.apache.org/jira/browse/FLINK-26992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski updated FLINK-26992:
-----------------------------------
    Summary: Passing directly between threads PojoSerializer may cause 
ConcurrentModificationException  (was: PojoSerializer may cause concurrent 
exception passing directly between threads)

> Passing directly between threads PojoSerializer may cause 
> ConcurrentModificationException
> -----------------------------------------------------------------------------------------
>
>                 Key: FLINK-26992
>                 URL: https://issues.apache.org/jira/browse/FLINK-26992
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / State Backends
>    Affects Versions: 1.16.0
>            Reporter: Yuan Mei
>            Priority: Major
>
> Extracted from FLINK-26835.
> While investigating this issue, we found that state backends are probably 
> also using non-thread-safe serializers from different threads.
> For example: {{RocksFullSnapshotStrategy#syncPrepareResources}} passes 
> {{keySerializer}} from the task thread to the async thread in order to 
> serialize the serializer itself. 
> {{RocksIncrementalSnapshotStrategy.RocksDBIncrementalSnapshotOperation#materializeMetaData}}
>  seems to be doing the same thing. If {{PojoSerializer}} is used as 
> {{keySerializer}}, I think this will lead to the same problems as above: 
> the async checkpoint thread iterates through 
> {{PojoSerializer#subclassSerializerCache}} while the task thread can still 
> change the map. 
> It looks like in all of those places the serializer should have been 
> duplicated ({{#duplicate}}) before being passed to another thread. Maybe 
> this should happen in {{RocksDBSnapshotStrategyBase}}. I don't know about 
> other state backends.
>  
> =======
> TODO:
> I second that each thread should own the Serializer instance that is 
> passed to it.
>  * Figure out whether the current way of passing the Serializer really causes 
> problems in the state backend (i.e. whether concurrent modification is 
> possible).
>  * Find out in what other places a Serializer is passed directly between 
> threads.
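
The "duplicate before passing to another thread" idea from the description can be sketched roughly as follows. This is a minimal, self-contained illustration and not Flink code: {{CachingSerializer}}, its {{subclassCache}} field, and {{snapshotOnAsyncThread}} are hypothetical stand-ins for a stateful serializer such as {{PojoSerializer}} and for the snapshot strategies, and the copy semantics of the hypothetical {{duplicate()}} are an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class DuplicateBeforeHandoff {

    /** Hypothetical stand-in for a stateful, non-thread-safe serializer. */
    static final class CachingSerializer {
        // Plain HashMap: grows lazily on the task thread and is not thread-safe.
        private final Map<Class<?>, String> subclassCache = new HashMap<>();

        /** Simulates serialization, which may add an entry to the cache. */
        void serialize(Object value) {
            subclassCache.computeIfAbsent(value.getClass(), Class::getName);
        }

        /** Iterates the cache, roughly as writing serializer metadata would. */
        List<String> snapshotCache() {
            return new ArrayList<>(subclassCache.values());
        }

        /** Returns an instance with its own cache, so threads share no mutable state. */
        CachingSerializer duplicate() {
            CachingSerializer copy = new CachingSerializer();
            copy.subclassCache.putAll(subclassCache);
            return copy;
        }
    }

    /** Duplicates on the calling (task) thread, then hands only the copy to the async thread. */
    static List<String> snapshotOnAsyncThread(CachingSerializer taskSerializer) throws Exception {
        CachingSerializer forAsync = taskSerializer.duplicate();
        Callable<List<String>> snapshotTask = forAsync::snapshotCache;

        ExecutorService asyncExecutor = Executors.newSingleThreadExecutor();
        try {
            Future<List<String>> result = asyncExecutor.submit(snapshotTask);
            // The task thread keeps mutating its own instance; the copy is unaffected.
            taskSerializer.serialize(Long.valueOf(1L));
            return result.get();
        } finally {
            asyncExecutor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        CachingSerializer serializer = new CachingSerializer();
        serializer.serialize("a");
        serializer.serialize(42);
        System.out.println(snapshotOnAsyncThread(serializer).size());
    }
}
```

Without the {{duplicate()}} call, the async thread would iterate the same {{HashMap}} that the task thread is still modifying, which is the kind of access that can raise {{ConcurrentModificationException}}.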



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
