[GitHub] [geode] dschneider-pivotal commented on a change in pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

GitBox Thu, 15 Jul 2021 13:10:26 -0700


dschneider-pivotal commented on a change in pull request #6699:
URL: https://github.com/apache/geode/pull/6699#discussion_r670756392




##########
File path: 
geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizableOrderStatisticsTree.java
##########
@@ -258,6 +267,7 @@ public boolean remove(Object o) {
     }
 
     x = deleteNode(x);
+    updateSizeInBytes(x.key, false);

Review comment:
       I think these calls would be a bit easier to understand if the looked 
like this: incSizeInBytes(element) and decSizeInBytes(element). Those two 
methods could call a common method getElementSize(element) if needed. 

##########
File path: 
geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizableOrderStatisticsTree.java
##########
@@ -46,11 +49,15 @@
  * @author Rodion "rodde" Efremov
  * @version 1.6 (Feb 11, 2016)
  */
-public class OrderStatisticsTree<E extends Comparable<? super E>>
+public class SizableOrderStatisticsTree<E extends Comparable<? super E>>

Review comment:
       What was the reason for renaming this class? I noticed you did not 
rename the interface it implements, OrderedStatisticsSet, to 
SizableOrderedStatisticsSet so I was wondering if this needed to be renamed.

##########
File path: 
geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/data/RedisSortedSet.java
##########
@@ -34,45 +35,44 @@
 import java.util.Objects;
 
 import it.unimi.dsi.fastutil.bytes.ByteArrays;
-import it.unimi.dsi.fastutil.objects.Object2ObjectOpenCustomHashMap;
 
 import org.apache.geode.annotations.VisibleForTesting;
 import org.apache.geode.cache.Region;
 import org.apache.geode.internal.InternalDataSerializer;
 import org.apache.geode.internal.serialization.DeserializationContext;
 import org.apache.geode.internal.serialization.KnownVersion;
 import org.apache.geode.internal.serialization.SerializationContext;
+import org.apache.geode.internal.size.Sizeable;
 import org.apache.geode.redis.internal.RedisConstants;
 import org.apache.geode.redis.internal.collections.OrderStatisticsSet;
-import org.apache.geode.redis.internal.collections.OrderStatisticsTree;
+import org.apache.geode.redis.internal.collections.SizableOrderStatisticsTree;
+import 
org.apache.geode.redis.internal.collections.SizeableObject2ObjectOpenCustomHashMapWithCursor;
 import org.apache.geode.redis.internal.delta.AddsDeltaInfo;
 import org.apache.geode.redis.internal.delta.DeltaInfo;
 import org.apache.geode.redis.internal.delta.RemsDeltaInfo;
 import 
org.apache.geode.redis.internal.executor.sortedset.SortedSetRangeOptions;
 import org.apache.geode.redis.internal.executor.sortedset.ZAddOptions;
 
 public class RedisSortedSet extends AbstractRedisData {
-  private Object2ObjectOpenCustomHashMap<byte[], OrderedSetEntry> members;
+  private SizeableObject2ObjectOpenCustomHashMapWithCursor<byte[], 
OrderedSetEntry> members;
   private OrderStatisticsSet<AbstractOrderedSetEntry> scoreSet;
+  // This field is used to keep track of the size associated with objects that 
are referenced by
+  // both backing collections, since they will be counted twice otherwise
+  private int sizeInBytesAdjustment = 0;
 
-  protected static final int BASE_REDIS_SORTED_SET_OVERHEAD = 184;
-  protected static final int PER_PAIR_OVERHEAD = 48;
-
-  private int sizeInBytes = BASE_REDIS_SORTED_SET_OVERHEAD;
+  protected static final int BASE_REDIS_SORTED_SET_OVERHEAD = 40;

Review comment:
       It seems like comments on these OVERHEAD fields would be helpful. They 
should describe what fields were used calculate the overhead. Then in the 
future if someone changes a field or removes one or adds one they could figure 
out how to adjust these constants.

##########
File path: 
geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizableOrderStatisticsTree.java
##########
@@ -731,6 +741,25 @@ private int count(Node<E> node) {
     return leftTreeSize + 1 + rightTreeSize;
   }
 
+  void updateSizeInBytes(E element, boolean isAdd) {
+    int sizeChange;
+    if (element instanceof Sizeable) {

Review comment:
       Would it be worth having our own SizableSizer that would do this 
instanceof check and then delegate to a ReflectionSingleObjectSizer if it is 
not a Sizable? It would clean this call up a bit. I don't know if we have this 
same pattern in other places.

##########
File path: 
geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/data/RedisSortedSet.java
##########
@@ -171,20 +170,24 @@ public int getDSFID() {
   }
 
   protected synchronized byte[] memberAdd(byte[] memberToAdd, byte[] 
scoreToAdd) {
-    byte[] oldScore = null;
-
     OrderedSetEntry newEntry = new OrderedSetEntry(memberToAdd, scoreToAdd);
-    OrderedSetEntry orderedSetEntry = members.put(memberToAdd, newEntry);
-    if (orderedSetEntry == null) {
+    OrderedSetEntry existingEntry = members.put(memberToAdd, newEntry);
+    if (existingEntry == null) {
       scoreSet.add(newEntry);
-      sizeInBytes += calculateSizeOfFieldValuePair(memberToAdd, scoreToAdd);
+      // Without this adjustment, we count the entry and member name array 
twice, since references
+      // to them appear in both backing collections.
+      sizeInBytesAdjustment += newEntry.getSizeInBytes() + 
calculateByteArraySize(memberToAdd);
+      return null;
     } else {
-      scoreSet.remove(orderedSetEntry);
+      scoreSet.remove(existingEntry);
       scoreSet.add(newEntry);
-      oldScore = orderedSetEntry.getScoreBytes();
-      sizeInBytes += scoreToAdd.length - oldScore.length;
+      // When updating an entry, the references to the member name array in 
the backing collections

Review comment:
       It seems to me like we should in this update case canonicalize the 
memberToAdd byte array so that we reuse the one already stored in the members 
map instead of this new one. I think you could do this by doing a 
OrderedSetEntry existingEntry = members.get(memberToAdd) and if it returns 
non-null then reading existingEntry.getMember() and using that to create the 
newEntry. If OrderedSetEntry is always read while synchronized then you could 
make it mutable and in that case you wouldn't need the update to create a new 
entry each time. I think you would need to do the 
scoreSet.remove(existingEntry) then mutate existingEntry and then 
scoreSet.add(existingEntry). This would prevent some gc churn.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [geode] dschneider-pivotal commented on a change in pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

Reply via email to