nonbinaryprogrammer commented on pull request #6296:
URL: https://github.com/apache/geode/pull/6296#issuecomment-834729967


   Draft of commit message (feel free to leave comments):
   GEODE-9136: Make RedisData implement Sizeable
   
   - use Sizeable in RedisString, RedisSet, and RedisHash
   - add unit tests of bytes in use for all three classes
   - update memory overhead tests to reflect decrease in overhead 
   - calculate estimated size in unit tests and use constants in the classes
   
   RedisData should implement Sizeable to increase the accuracy of the 
estimation of bytes in use. It is important that we are close to the right size 
in order for rebalancing to work properly, however the exact value is not 
important. These changes make the size for sets and hashes be within 5% of the 
size measured by using reflection. The accuracy changes as you add entries, and 
as entries get larger, because of the way that hashes, sets, and 
ByteArrayWrappers get resized. As we fill up the hash and set, we become more 
accurate in our estimation of the size. When full, the next entry added causes 
the structure to be resized, which means that our data now only accounts for a 
portion of the total size of the structure. We could have kept track of the 
resizes that happen in order to be even more accurate in our estimations, but 
doing so would add computational overhead for adding each key, and for each 
field at that key. It was decided that the benefits do not outweigh the cost.
   
   RedisHash and RedisSet both use constants to account for the overheads of:
   - an empty (RedisHash/RedisSet)
   - an empty (HashMap/HashSet) within the (RedisHash/RedisSet)
   - each member in the (hash/set)
   RedisHash also has an overhead for the first value in the hash.
   We decided to use constants in the RedisData classes in order to reduce the 
number of calculations we have to do every time we add a new key. The math used 
to derive those constants live in the RedisSetTest and RedisHashTest classes. 
The tests of the constants will fail if the internal implementation of the 
classes changes. If the overheads decrease, then the constants need to be 
updated to reflect that. If the overheads increase, think through the changes 
before adding to the overhead. Significant changes to RedisSet or RedisHash 
could potentially break the math, so be sure to check where the problem is, in 
case we need to change the calculations.
   There is a magical +5 in 
RedisSetTest.perMemberOverheadConstant_shouldMatchCalculatedValue. We think it 
came from the resizing, and that 5 happens to work ok, even though it should be 
dynamically calculated. But as previously said, the benefits do not outweigh 
the cost.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to