StefanRRichter commented on a change in pull request #6558: [FLINK-9116] Introduce getAll and removeAll for MapState
URL: https://github.com/apache/flink/pull/6558#discussion_r210510928
##########
File path: flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBMapState.java
##########
@@ -129,6 +132,36 @@ public UV get(UK userKey) throws IOException, RocksDBException {
 		return (rawValueBytes == null ? null : deserializeUserValue(rawValueBytes));
 	}
+	@Override
+	public Map<UK, UV> getAll(Collection<UK> keys) throws IOException, RocksDBException {
+
+		Map<UK, byte[]> keyBytesMap = new HashMap<>(keys.size());
+		List<ColumnFamilyHandle> columnFamilyHandles = new ArrayList<>(keys.size());
+		List<byte[]> keyBytesList = new ArrayList<>(keys.size());
+
+		for (UK key : keys) {
+			columnFamilyHandles.add(columnFamily);
+			byte[] keyByte = serializeUserKeyWithCurrentKeyAndNamespace(key);
+			keyBytesList.add(keyByte);
+			keyBytesMap.put(key, keyByte);
+		}
+
+		Map<byte[], byte[]> result = backend.db.multiGet(columnFamilyHandles, keyBytesList);
Review comment:
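I think we should favor iterating over the individual keys and calling get
for each one, rather than adding them all to a list and calling multiGet.
That seems more memory efficient, because we would not have to build the
intermediate key and column family handle lists first.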
https://github.com/facebook/rocksdb/wiki/RocksDB-FAQ
> Q: If I want to retrieve 10 keys from rocksdb, is it better to batch them
> and use MultiGet() versus issuing 10 individual Get() calls?
>
> A: The performance is similar. MultiGet() reads from the same consistent
> view, but it is not faster.
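
A rough sketch of the per-key variant, only to illustrate the idea. It is not code from this PR: `PerKeyLookup` and the `singleGet` parameter are hypothetical stand-ins for `RocksDBMapState` and its existing `get(UK userKey)`, and a plain `Function` is used so the snippet compiles without RocksDB on the classpath. The real method would still propagate `IOException` and `RocksDBException`.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of Flink or RocksDB.
final class PerKeyLookup {

    private PerKeyLookup() {
    }

    /**
     * Looks up each key with a single-key get and collects the hits.
     * Only the result map is allocated; no list of serialized keys or
     * column family handles is built up front.
     *
     * @param keys      user keys to look up
     * @param singleGet stand-in for the state's existing get(UK userKey);
     *                  assumed to return null when the key is absent
     */
    static <K, V> Map<K, V> getAll(
            Collection<K> keys,
            Function<? super K, ? extends V> singleGet) {

        Map<K, V> result = new HashMap<>(keys.size());
        for (K key : keys) {
            V value = singleGet.apply(key);
            // Skip absent entries instead of storing null values.
            if (value != null) {
                result.put(key, value);
            }
        }
        return result;
    }
}
```

Whether absent keys should be skipped or mapped to null is a design choice this sketch makes on its own; the PR may want different semantics. Note also that, per the FAQ quoted above, individual gets do not read from one consistent view the way MultiGet() does.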