carp84 commented on issue #9501: [FLINK-12697] [State Backends] Support on-disk 
state storage for spill-able heap backend
URL: https://github.com/apache/flink/pull/9501#issuecomment-534964705
 
 
   > 2. That's what I was suspecting. I don't wanna necessarily change it but 
do we know how much performance we gain by this? I think there would have been 
also other solutions to the problem. For example, one could have a reusable 
object which has a mutable `nodeAddress` field. Then one would not have to 
create a new object every time. Ideally one would hide this behind a factory 
method which can decide how to handle it. Also the access to the field could 
have been encapsulated by the object and if needed could have happened lazily. 
The benefit of this approach would be that it is easier to maintain and test 
because the API would tell you what such an object can do and we would have a 
dedicated type instead of longs you have to pass around.
   
   About the performance impact, please refer to [the analysis of the JDK CSLM 
implementation](https://docs.google.com/document/d/16VIY7o-18sM-pIlIYkbTuhKPmwfnqabCt_nlOARAzdg/edit#)
 and the compacted data structure we introduced for HBase to reduce GC pressure. 
Search for `Key space schema` and `Value space schema` in `SkipListUtils` and 
you will find a similar design here.
   
   As for the reusable object, it would add a lot of effort and complexity to 
make sure it is never manipulated concurrently.
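
   To illustrate the concern about a reusable object, here is a minimal sketch. The names `NodeRef`, `refFor` and `NodeRefSketch` are hypothetical and not taken from the PR. It only shows the aliasing problem that appears even on a single thread; with several threads sharing the handle, synchronization would be needed on top of this.

```java
public class NodeRefSketch {

    /** Hypothetical reusable handle with a mutable address (not part of the PR). */
    static final class NodeRef {
        private long nodeAddress;

        void reset(long nodeAddress) {
            this.nodeAddress = nodeAddress;
        }

        long address() {
            return nodeAddress;
        }
    }

    // One shared instance, reused on every call to avoid allocating a new object.
    private static final NodeRef SHARED = new NodeRef();

    /** Hypothetical factory method that hands out the reused instance. */
    static NodeRef refFor(long nodeAddress) {
        SHARED.reset(nodeAddress);
        return SHARED;
    }

    public static void main(String[] args) {
        NodeRef first = refFor(16L);
        NodeRef second = refFor(32L);
        // Both variables alias the same object, so `first` no longer refers to 16.
        System.out.println(first == second);
        System.out.println(first.address());

        // Raw longs are copied by value, so one cannot overwrite the other.
        long firstAddress = 16L;
        long secondAddress = 32L;
        System.out.println(firstAddress + " " + secondAddress);
    }
}
```

   Any caller that holds a handle across another lookup has to copy the address out first, and that rule would have to be enforced everywhere the handle is used.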

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
