[ 
https://issues.apache.org/jira/browse/GEODE-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17342783#comment-17342783
 ] 

ASF subversion and git services commented on GEODE-9136:
--------------------------------------------------------

Commit 6a0eba25d5ed5cc7146ce6374d39dd12b22745f3 in geode's branch 
refs/heads/develop from Hale Bales
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=6a0eba2 ]

GEODE-9136: make RedisData implement Sizeable (#6296)

 - use Sizeable in RedisString, RedisSet, and RedisHash
 - add unit tests of bytes in use for all three classes
 - update memory overhead tests to reflect decrease in overhead
 - calculate estimated size in unit tests and use constants in the classes

RedisData should implement Sizeable to increase the accuracy of the estimation 
of bytes in use. The estimate needs to be close to the actual size for 
rebalancing to work properly; the exact value is not important. 
These changes bring the estimated size of sets and hashes to within 5% of the size 
measured by using reflection. The accuracy changes as you add entries, and as 
entries get larger, because of the way that hashes, sets, and ByteArrayWrappers 
get resized. As we fill up the hash and set, we become more accurate in our 
estimation of the size. When full, the next entry added causes the structure to 
be resized, which means that our data now only accounts for a portion of the 
total size of the structure. We could have kept track of the resizes that 
happen in order to be even more accurate in our estimations, but doing so would 
add computational overhead for adding each key, and for each field at that key. 
It was decided that the benefits do not outweigh the cost.
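
The resize effect described above can be sketched as follows. This is only an 
illustration, assuming the default java.util.HashMap initial capacity (16) and 
load factor (0.75); ResizeSketch, capacityFor, and occupancy are hypothetical 
names and are not part of the commit.

```java
// Illustrative sketch only; not code from the commit.
final class ResizeSketch {
  // Assumed defaults of java.util.HashMap.
  static final int DEFAULT_CAPACITY = 16;
  static final double LOAD_FACTOR = 0.75;

  // Table capacity after inserting `entries` elements one at a time.
  // Ignores the lazy allocation of the table before the first insert.
  static int capacityFor(int entries) {
    int capacity = DEFAULT_CAPACITY;
    // The table doubles once the entry count exceeds capacity * load factor.
    while (entries > capacity * LOAD_FACTOR) {
      capacity *= 2;
    }
    return capacity;
  }

  // Fraction of table slots that the stored entries account for.
  static double occupancy(int entries) {
    return (double) entries / capacityFor(entries);
  }
}
```

Under these assumptions, 12 entries fit in a 16-slot table, and the 13th entry 
doubles the table to 32 slots, so the entries suddenly account for a smaller 
share of the structure. A per-entry constant cannot follow that jump, which is 
why the estimate is most accurate just before a resize.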

RedisHash and RedisSet both use constants to account for the overheads of:

 - an empty (RedisHash/RedisSet)
 - an empty (HashMap/HashSet) within the (RedisHash/RedisSet)
 - each member in the (hash/set)

RedisHash also has an overhead for the first value in the hash.

We decided to use constants in the RedisData classes in order to reduce the 
number of calculations we have to do every time we add a new key. The math used 
to derive those constants lives in the RedisSetTest and RedisHashTest classes. 
The tests of the constants will fail if the internal implementation of the 
classes changes. If the overheads decrease, then the constants need to be 
updated to reflect that. If the overheads increase, think through the changes 
before adding to the overhead. Significant changes to RedisSet or RedisHash 
could potentially break the math, so be sure to check where the problem is, in 
case we need to change the calculations.

There is a magic +5 in 
RedisSetTest.perMemberOverheadConstant_shouldMatchCalculatedValue. We think it 
came from the resizing, and that 5 happens to work OK, even though it should be 
dynamically calculated. As noted above, the benefits of calculating it 
dynamically do not outweigh the cost.
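
A minimal sketch of the constant-based approach, for illustration only. 
SizeableSketch is a hypothetical stand-in for the shape of Geode's Sizeable 
interface, SizedSetSketch is a hypothetical class, the two constant values are 
placeholders rather than the values used in the commit, and String length 
stands in for the byte length of a member.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in so the sketch compiles on its own.
interface SizeableSketch {
  int getSizeInBytes();
}

// Illustrative sketch only; not the RedisSet implementation.
final class SizedSetSketch implements SizeableSketch {
  // Placeholder values, NOT the constants from the commit.
  static final int BASE_OVERHEAD = 100;
  static final int PER_MEMBER_OVERHEAD = 50;

  private final Set<String> members = new HashSet<>();
  // Running estimate, adjusted on every successful add or remove.
  private int sizeInBytes = BASE_OVERHEAD;

  boolean add(String member) {
    if (!members.add(member)) {
      return false; // duplicate: the estimate does not change
    }
    sizeInBytes += PER_MEMBER_OVERHEAD + member.length();
    return true;
  }

  boolean remove(String member) {
    if (!members.remove(member)) {
      return false; // absent: the estimate does not change
    }
    sizeInBytes -= PER_MEMBER_OVERHEAD + member.length();
    return true;
  }

  @Override
  public int getSizeInBytes() {
    return sizeInBytes;
  }
}
```

Each add or remove costs one addition or subtraction, and reading the size is 
a field read, which is the trade-off described above: cheap bookkeeping in 
exchange for an estimate that ignores resizes.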

> Redis: make RedisData implement Sizeable
> ----------------------------------------
>
>                 Key: GEODE-9136
>                 URL: https://issues.apache.org/jira/browse/GEODE-9136
>             Project: Geode
>          Issue Type: New Feature
>          Components: redis
>            Reporter: Hale Bales
>            Assignee: Hale Bales
>            Priority: Major
>              Labels: pull-request-available
>
> If RedisData.java implemented the 
> org/apache/geode/internal/size/Sizeable.java interface, size calculations for 
> deltas could be much faster.
> **AC**
> - add tests for String, Set, and Hash add, append, and remove operations 
> changing the bytesInUse
> - all existing hydra tests pass



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
