carp84 commented on issue #9501: [FLINK-12697] [State Backends] Support on-disk state storage for spill-able heap backend
URL: https://github.com/apache/flink/pull/9501#issuecomment-534964705

> 2. That's what I was suspecting. I don't wanna necessarily change it but do we know how much performance we gain by this? I think there would have been also other solutions to the problem. For example, one could have a reusable object which has a mutable `nodeAddress` field. Then one would not have to create a new object every time. Ideally one would hide this behind a factory method which can decide how to handle it. Also the access to the field could have been encapsulated by the object and if needed could have happened lazily. The benefit of this approach would be that it is easier to maintain and test because the API would tell you what such an object can do and we would have a dedicated type instead of longs you have to pass around.

About the performance impact, please refer to [the analysis of JDK CSLM implementation](https://docs.google.com/document/d/16VIY7o-18sM-pIlIYkbTuhKPmwfnqabCt_nlOARAzdg/edit#) and the compact data structure we introduced for HBase to reduce GC pressure. Search for `Key space schema` and `Value space schema` in `SkipListUtils` to find a similar design here.

About the reusable object, it would add a lot of effort and complexity to make sure it is never manipulated concurrently.
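To make the trade-off concrete, here is a minimal sketch of the reusable-object idea from the quoted suggestion, together with the aliasing problem it introduces. Everything in it is hypothetical: `NodeHandle`, `NodeHandleSketch`, the toy node layout (an `int` key length followed by a `long` next-node address), and the use of a heap `ByteBuffer` as the storage space are assumptions for illustration only, not the actual `SkipListUtils` schema or any Flink API.

```java
import java.nio.ByteBuffer;

/** Hypothetical reusable view over a node stored in a buffer; not Flink code. */
final class NodeHandle {
    // Assumed toy layout at each node address: [int keyLen][long nextNode].
    static final int KEY_LEN_OFFSET = 0;
    static final int NEXT_NODE_OFFSET = Integer.BYTES;

    private final ByteBuffer space;
    private long nodeAddress; // mutable on purpose, so the handle can be reused

    private NodeHandle(ByteBuffer space) {
        this.space = space;
    }

    /** Factory method: the place where one could decide to pool or reuse handles. */
    static NodeHandle create(ByteBuffer space) {
        return new NodeHandle(space);
    }

    /** Re-targets this handle at another node instead of allocating a new object. */
    NodeHandle pointTo(long address) {
        this.nodeAddress = address;
        return this;
    }

    int keyLen() {
        return space.getInt((int) nodeAddress + KEY_LEN_OFFSET);
    }

    long nextNode() {
        return space.getLong((int) nodeAddress + NEXT_NODE_OFFSET);
    }

    /** Writes one node in the assumed layout using absolute puts. */
    static void writeNode(ByteBuffer space, long address, int keyLen, long nextNode) {
        space.putInt((int) address + KEY_LEN_OFFSET, keyLen);
        space.putLong((int) address + NEXT_NODE_OFFSET, nextNode);
    }
}

final class NodeHandleSketch {
    static final long NIL = -1L; // assumed "no next node" marker

    /** Walks the list with a single handle and sums the key lengths. */
    static int sumKeyLens(ByteBuffer space, long head) {
        NodeHandle handle = NodeHandle.create(space);
        int sum = 0;
        for (long addr = head; addr != NIL; addr = handle.nextNode()) {
            sum += handle.pointTo(addr).keyLen();
        }
        return sum;
    }

    /**
     * Shows the hazard: a caller that keeps a reference to the shared handle
     * sees different data once anyone re-targets it. Returns true if the
     * observed key length changed.
     */
    static boolean aliasingHazard(ByteBuffer space, long first, long second) {
        NodeHandle shared = NodeHandle.create(space);
        NodeHandle held = shared.pointTo(first); // same object as 'shared'
        int before = held.keyLen();
        shared.pointTo(second); // another user re-targets the shared handle
        return held.keyLen() != before;
    }

    public static void main(String[] args) {
        ByteBuffer space = ByteBuffer.allocate(64);
        NodeHandle.writeNode(space, 0L, 3, 12L);
        NodeHandle.writeNode(space, 12L, 5, 24L);
        NodeHandle.writeNode(space, 24L, 7, NIL);
        System.out.println("sum of key lengths: " + sumKeyLens(space, 0L));
        System.out.println("held handle changed: " + aliasingHazard(space, 0L, 12L));
    }
}
```

The single-threaded walk works without allocating per node, which is the appeal of the suggestion. The second method shows why sharing is fragile: even without threads, a retained handle silently changes meaning when it is re-targeted, and with concurrent readers and writers each one would need its own handle or external synchronization. Passing plain `long` addresses avoids that shared mutable state at the cost of a less expressive API.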
