DonalEvans commented on a change in pull request #6831:
URL: https://github.com/apache/geode/pull/6831#discussion_r702030694



##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/data/RedisSortedSet.java
##########
@@ -377,6 +382,25 @@ long zrevrank(byte[] member) {
     return scoreSet.size() - scoreSet.indexOf(orderedSetEntry) - 1;
   }
 
+  ImmutablePair<Integer, List<byte[]>> zscan(Pattern matchPattern, int count, int cursor) {
+    // No need to allocate more space than it's possible to use given the size of the sorted set. We
+    // need to add 1 to zcard() to ensure that if count > members.size(), we return a cursor of 0
+    long maximumCapacity = 2L * Math.min(count, zcard() + 1);
+    if (maximumCapacity > Integer.MAX_VALUE) {
+      LogService.getLogger().info(
+          "The size of the data to be returned by zscan, {}, exceeds the maximum capacity of an array. A value for the ZSCAN COUNT argument less than {} should be used",
+          maximumCapacity, Integer.MAX_VALUE / 2);
+      throw new IllegalArgumentException("Requested array size exceeds VM limit");
+    }
+    List<byte[]> resultList = new ArrayList<>((int) maximumCapacity);
+    do {
+      cursor = members.scan(cursor, 1,

Review comment:
   The difficulty here is that Redis behaves differently depending on the size of the sorted set or hash. Below a certain size, it internally uses a different data structure, which changes how the scan commands behave:
   
   > "When iterating Sets encoded as intsets (small sets composed of just integers), or Hashes and Sorted Sets encoded as ziplists (small hashes and sets composed of small individual values), usually all the elements are returned in the first SCAN call regardless of the COUNT value."
   
   from https://redis.io/commands/scan#the-count-option
   
   For larger sizes, however, a scan command used with MATCH and COUNT may return no results at all, only a cursor value. I created a hash with ~500 entries, three of which matched the pattern `a*`, and called HSCAN with `MATCH a*` and various COUNT values. Redis returned an empty array for COUNT values up to 106, and only one result for COUNT values up to 243, so it's not accurate to say that the COUNT argument indicates the number of matching entries that should be returned.
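
One model that would produce results like these, sketched under the assumption that COUNT bounds how many entries are examined per call and MATCH is applied to those entries afterwards (the Redis SCAN documentation describes MATCH as a filter applied after retrieval). `ScanSketch`, `scan`, and the list-backed cursor are hypothetical; real Redis walks hash table buckets, so the exact COUNT thresholds will not line up with this model:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class ScanSketch {
  /** Result of one scan call: the next cursor (0 means finished) and the matches found. */
  static final class ScanResult {
    final int cursor;
    final List<String> matches;

    ScanResult(int cursor, List<String> matches) {
      this.cursor = cursor;
      this.matches = matches;
    }
  }

  /** Examines at most {@code count} entries starting at {@code cursor}, then filters them. */
  static ScanResult scan(List<String> entries, Pattern match, int count, int cursor) {
    // Widen to long so that a very large COUNT cannot overflow the end index.
    int end = (int) Math.min((long) entries.size(), (long) cursor + count);
    List<String> matches = new ArrayList<>();
    for (int i = cursor; i < end; i++) {
      String entry = entries.get(i);
      if (match.matcher(entry).matches()) {
        matches.add(entry);
      }
    }
    // A cursor of 0 signals that the whole collection has been visited.
    return new ScanResult(end == entries.size() ? 0 : end, matches);
  }
}
```

In this model, a call with COUNT 100 over 500 entries whose only match sits at index 300 returns an empty match list and a non-zero cursor, which is the shape of result described above.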
   
   Given the inconsistent behaviour of Redis' scan commands, I'm not sure that we want to try to exactly match that behaviour in situations where a full iteration isn't guaranteed, because our internal implementation will not behave the same way for small set/hash sizes. This can be tested by creating a hash with 20 entries, then calling HSCAN with a COUNT value of 5: Redis will return all entries, but our implementation will return 5.
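
If tests rely only on the full-iteration guarantee, the per-call difference stops mattering. A minimal sketch of that idea, using hypothetical names (`FullIterationSketch`, `ScanCall`, `Page`) and in-memory stand-ins rather than real Redis or Geode calls:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class FullIterationSketch {
  /** One page of scan output: the next cursor (0 means finished) and the entries returned. */
  static final class Page {
    final int cursor;
    final List<String> entries;

    Page(int cursor, List<String> entries) {
      this.cursor = cursor;
      this.entries = entries;
    }
  }

  /** Hypothetical stand-in for a single HSCAN or ZSCAN call. */
  interface ScanCall {
    Page scan(int cursor, int count);
  }

  /** Repeats the call until the cursor returns to 0 and collects everything seen. */
  static Set<String> drain(ScanCall call, int count) {
    Set<String> seen = new HashSet<>();
    int cursor = 0;
    do {
      Page page = call.scan(cursor, count);
      seen.addAll(page.entries);
      cursor = page.cursor;
    } while (cursor != 0);
    return seen;
  }

  /** Models the small-collection case: everything in the first call, COUNT ignored. */
  static ScanCall allAtOnce(List<String> data) {
    return (cursor, count) -> new Page(0, new ArrayList<>(data));
  }

  /** Models an implementation that returns at most COUNT entries per call. */
  static ScanCall paged(List<String> data) {
    return (cursor, count) -> {
      int end = (int) Math.min((long) data.size(), (long) cursor + count);
      return new Page(end == data.size() ? 0 : end, new ArrayList<>(data.subList(cursor, end)));
    };
  }
}
```

With 20 entries and a COUNT of 5, the first call to `allAtOnce` yields 20 entries and the first call to `paged` yields 5, yet `drain` returns the same set for both.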




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


