[
https://issues.apache.org/jira/browse/FLINK-5917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891945#comment-15891945
]
Stefan Richter commented on FLINK-5917:
---------------------------------------
I agree with [~aljoscha]. Unlike Java's HashMap, RocksDB does not report
whether a key already existed when you write to it, and this makes a lot of
sense if you consider the LSM implementation of RocksDB: a put is appended
without reading existing data. RocksDB does offer an estimate of the key count
(probably implemented with HyperLogLog), but I think they would expose the
exact count if this could be done easily. The only workaround that I see for
maintaining a cached count is to perform a lookup before every insert, which
would be prohibitively expensive.
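
To make the cost of that workaround concrete, here is a minimal sketch in plain Java. It does not use the RocksDB or Flink APIs: BlindStore, InMemoryBlindStore and CountingStore are hypothetical names invented for this illustration, and the in-memory map only stands in for a store whose put() gives no feedback about existing keys. It also assumes values are never null, so a null lookup result means "key absent".

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical store whose put() reports nothing about a previous mapping. */
interface BlindStore {
    void put(String key, String value);
    String get(String key); // point lookup; on an LSM store this may touch several levels
    void remove(String key);
}

/** In-memory stand-in so the sketch runs without RocksDB. */
final class InMemoryBlindStore implements BlindStore {
    private final Map<String, String> data = new HashMap<>();
    int lookups = 0; // counts get() calls to make the extra reads visible

    public void put(String key, String value) { data.put(key, value); }
    public String get(String key) { lookups++; return data.get(key); }
    public void remove(String key) { data.remove(key); }
}

/** Maintains a cached size by paying one lookup per write. */
final class CountingStore {
    private final BlindStore store;
    private long size = 0;

    CountingStore(BlindStore store) { this.store = store; }

    void put(String key, String value) {
        // The store cannot tell us whether the key existed, so read first.
        if (store.get(key) == null) {
            size++;
        }
        store.put(key, value);
    }

    void remove(String key) {
        // The same read is needed before a delete to keep the count exact.
        if (store.get(key) != null) {
            size--;
            store.remove(key);
        }
    }

    long size() { return size; }
}

final class CachedCountDemo {
    public static void main(String[] args) {
        InMemoryBlindStore backing = new InMemoryBlindStore();
        CountingStore counted = new CountingStore(backing);
        counted.put("a", "1");
        counted.put("b", "2");
        counted.put("a", "3"); // overwrite, size must not change
        System.out.println("size=" + counted.size() + " lookups=" + backing.lookups);
    }
}
```

Every write costs one additional read, which is exactly the overhead that a blind LSM write avoids.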
> Remove MapState.size()
> ----------------------
>
> Key: FLINK-5917
> URL: https://issues.apache.org/jira/browse/FLINK-5917
> Project: Flink
> Issue Type: Improvement
> Components: DataStream API
> Affects Versions: 1.3.0
> Reporter: Aljoscha Krettek
>
> I'm proposing to remove {{size()}} because it is a prohibitively expensive
> operation and users might not be aware of that. Instead of {{size()}}, users
> can iterate over all mappings to determine the size; when doing this, they
> will be aware of the fact that it is a costly operation.
> Right now, {{size()}} is only costly on the RocksDB state backend, but I think
> that with future developments on the in-memory state backend it might also
> become an expensive operation there.
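
As an illustration of the iterator-based alternative proposed in the description, the sketch below counts mappings by walking every entry. It is plain Java with no Flink dependency; MapStateSizeSketch and countEntries are hypothetical names, and the LinkedHashMap merely stands in for keyed state. In a Flink job one would presumably pass the result of MapState.entries() instead, which can throw a checked exception that the caller has to handle.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MapStateSizeSketch {

    /** Counts mappings by visiting each one: O(n) work that is visible at the call site. */
    static <K, V> long countEntries(Iterable<Map.Entry<K, V>> entries) {
        long count = 0;
        for (Map.Entry<K, V> ignored : entries) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Stand-in for the contents of a MapState under the current key.
        Map<String, Integer> state = new LinkedHashMap<>();
        state.put("clicks", 3);
        state.put("views", 10);
        state.put("clicks", 4); // overwrite, still two mappings

        System.out.println(countEntries(state.entrySet()));
    }
}
```

Because the loop is written out by the caller, the linear cost is explicit rather than hidden behind a method that looks like a constant-time accessor.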