[ https://issues.apache.org/jira/browse/FLINK-6219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950388#comment-15950388 ]
Xiaogang Shi commented on FLINK-6219:
-------------------------------------

I prefer to use sorted states (e.g., {{SortedMapState}}) rather than a new state backend to address the described problem. Some users have mentioned similar demands for sorted states, so I think we should provide them to facilitate the development of user applications.

Implementing such sorted states, however, may be very challenging. In {{HeapStateBackend}}, we need a data structure that supports both copy-on-write (for asynchronous snapshotting) and sorting. In {{RocksDBStateBackend}}, we need to find an efficient way to support customized sorting. Though RocksDBJava allows customized comparators, performance degrades significantly once a customized comparator is used (to approximately 1/3 to 1/15 of the QPS).

It's critical to address the problems mentioned above. Otherwise, it is better to use a {{ValueState}} whose value is typed as {{SortedMap}} to sort user data under the same key.

> Add a state backend which supports sorting
> ------------------------------------------
>
> Key: FLINK-6219
> URL: https://issues.apache.org/jira/browse/FLINK-6219
> Project: Flink
> Issue Type: New Feature
> Components: State Backends, Checkpointing, Table API & SQL
> Reporter: sunjincheng
>
> When we implement the OVER window of
> [FLIP11|https://cwiki.apache.org/confluence/display/FLINK/FLIP-11%3A+Table+API+Stream+Aggregations],
> we notice that we need a state backend which supports sorting, allows for
> efficient insertion, traversal in order, and removal from the head.
> For example, in an event-time OVER window we need to sort by time. If the
> data is as follows:
> {code}
> (1L, 1, Hello)
> (2L, 2, Hello)
> (5L, 5, Hello)
> (4L, 4, Hello)
> {code}
> and we insert the data in random order, for example:
> {code}
> put((2L, 2, Hello)), put((1L, 1, Hello)), put((5L, 5, Hello)), put((4L, 4, Hello))
> {code}
> then we process the elements in time order:
> {code}
> process((1L, 1, Hello)), process((2L, 2, Hello)), process((4L, 4, Hello)), process((5L, 5, Hello))
> {code}
> Feedback is welcome. What do you think? [~xiaogang.shi] [~aljoscha] [~fhueske]

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
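
For illustration, the access pattern requested in the issue (insertion in arbitrary order, traversal in time order, removal from the head) can be sketched in plain Java with a {{java.util.TreeMap}}. This is only a sketch under assumptions: {{SortedBufferSketch}}, {{Row}}, {{pollHead}} and {{processInTimeOrder}} are hypothetical names, no Flink API is used, and nothing here addresses the snapshotting or RocksDB comparator concerns raised in the comment. It roughly corresponds to the sorted map that the comment suggests keeping inside a {{ValueState}} under each key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not a Flink class.
final class SortedBufferSketch {

    // One input row such as (1L, 1, Hello) from the issue description.
    static final class Row {
        final long time;
        final int value;
        final String text;

        Row(long time, int value, String text) {
            this.time = time;
            this.value = value;
            this.text = text;
        }

        @Override
        public String toString() {
            return "(" + time + "L, " + value + ", " + text + ")";
        }
    }

    // TreeMap keeps timestamps sorted; rows sharing a timestamp keep arrival order.
    private final TreeMap<Long, List<Row>> buffer = new TreeMap<>();

    // Insertion in arbitrary order, O(log n) per call.
    void put(Row row) {
        buffer.computeIfAbsent(row.time, t -> new ArrayList<Row>()).add(row);
    }

    // Removal from the head: removes and returns the rows with the smallest
    // timestamp, or an empty list when nothing is buffered.
    List<Row> pollHead() {
        Map.Entry<Long, List<Row>> head = buffer.pollFirstEntry();
        if (head == null) {
            return new ArrayList<Row>();
        }
        return head.getValue();
    }

    boolean isEmpty() {
        return buffer.isEmpty();
    }

    // Inserts the rows in the given arrival order, then drains them in time order.
    static List<String> processInTimeOrder(List<Row> arrivals) {
        SortedBufferSketch sketch = new SortedBufferSketch();
        for (Row row : arrivals) {
            sketch.put(row);
        }
        List<String> processed = new ArrayList<>();
        while (!sketch.isEmpty()) {
            for (Row row : sketch.pollHead()) {
                processed.add(row.toString());
            }
        }
        return processed;
    }
}
```

Feeding the four rows in the arrival order 2, 1, 5, 4 to {{processInTimeOrder}} returns them ordered by timestamp: 1, 2, 4, 5. Storing such a map as a single state value means the whole map is serialized on each access in a RocksDB-backed state, which is one reason a dedicated sorted state may still be worth the implementation effort.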