[ https://issues.apache.org/jira/browse/FLINK-24597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yun Tang resolved FLINK-24597.
------------------------------
    Resolution: Fixed

Merged
master: a907d92673a711612b287d184c00dad7fa42269f
release-1.14: 547b3befcff50bf8fc2bef4e596cd39a55c7d4b2
release-1.13: 77824d10edb3fd299c242865a3a03fe08e540a85

> RocksdbStateBackend getKeysAndNamespaces would return duplicate data when
> using MapState
> -----------------------------------------------------------------------------------------
>
>                 Key: FLINK-24597
>                 URL: https://issues.apache.org/jira/browse/FLINK-24597
>             Project: Flink
>          Issue Type: Bug
>          Components: API / State Processor, Runtime / State Backends
>    Affects Versions: 1.14.0, 1.12.4, 1.13.3
>            Reporter: Yue Ma
>            Assignee: Yue Ma
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.15.0, 1.14.1, 1.13.4
>
>         Attachments: image-2021-11-01-14-19-58-372.png
>
>
> For example, in RocksdbStateBackend, suppose we work in VoidNamespace and
> use a ValueState as below.
> {code:java}
> // insert record
> for (int i = 0; i < 3; ++i) {
>     keyedStateBackend.setCurrentKey(i);
>     testValueState.update(String.valueOf(i));
> }
> {code}
> Then we get all the keys and namespaces via the method
> RocksDBKeyedStateBackend#getKeysAndNamespaces(). The result of the traversal is
> <0,VoidNamespace>,<1,VoidNamespace>,<2,VoidNamespace>, which is as expected.
> However, if we use MapState and update the MapState with different user keys,
> getKeysAndNamespaces returns duplicate entries with the same
> keyAndNamespace.
> {code:java}
> // insert record
> for (int i = 0; i < 3; ++i) {
>     keyedStateBackend.setCurrentKey(i);
>     mapState.put("userKeyA_" + i, "userValue");
>     mapState.put("userKeyB_" + i, "userValue");
> }
> {code}
> The result of the traversal is
> <0,VoidNamespace>,<0,VoidNamespace>,<1,VoidNamespace>,<1,VoidNamespace>,<2,VoidNamespace>,<2,VoidNamespace>.
> By reading the code, I found that the main reason for this problem is in the
> implementation of _RocksStateKeysAndNamespaceIterator_.
> In the _hasNext_ method, when a new keyAndNamespace is created, it is not
> compared with the previous keyAndNamespace. Implementing the same comparison
> logic as RocksStateKeysIterator should solve this problem.
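
The comparison described above can be illustrated with a minimal, self-contained sketch. It is not the merged Flink patch: the class names KeysAndNamespacesDedupSketch, Entry and DedupIterator are hypothetical stand-ins, and the sketch assumes that stored entries sharing the same key and namespace are adjacent in iteration order, so comparing against the previously emitted pair is enough to drop duplicates.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in classes; none of these are Flink or RocksDB types.
final class KeysAndNamespacesDedupSketch {

    /** One stored MapState entry: (key, namespace, userKey). */
    static final class Entry {
        final int key;
        final String namespace;
        final String userKey;

        Entry(int key, String namespace, String userKey) {
            this.key = key;
            this.namespace = namespace;
            this.userKey = userKey;
        }
    }

    /** Emits "<key,namespace>" once per run of adjacent entries sharing that pair. */
    static final class DedupIterator implements Iterator<String> {
        private final Iterator<Entry> entries;
        private String previous; // last pair handed out, used for the comparison
        private String next;     // pair prepared by hasNext(), or null if none yet

        DedupIterator(Iterator<Entry> entries) {
            this.entries = entries;
        }

        @Override
        public boolean hasNext() {
            while (next == null && entries.hasNext()) {
                Entry e = entries.next();
                String candidate = "<" + e.key + "," + e.namespace + ">";
                // Skip entries whose pair equals the previous one
                // (same key and namespace, different user key).
                if (!candidate.equals(previous)) {
                    previous = candidate;
                    next = candidate;
                }
            }
            return next != null;
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            String result = next;
            next = null;
            return result;
        }
    }

    /** Rebuilds the MapState example (two user keys per key) and traverses it. */
    static List<String> traverseExample() {
        List<Entry> stored = new ArrayList<>();
        for (int i = 0; i < 3; ++i) {
            stored.add(new Entry(i, "VoidNamespace", "userKeyA_" + i));
            stored.add(new Entry(i, "VoidNamespace", "userKeyB_" + i));
        }
        List<String> result = new ArrayList<>();
        new DedupIterator(stored.iterator()).forEachRemaining(result::add);
        return result;
    }
}
```

Calling KeysAndNamespacesDedupSketch.traverseExample() returns one pair per key for the three keys set in the loop, even though each key holds two user keys. Without the comparison against the previous pair, the same traversal would return six pairs.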