cshannon commented on code in PR #3286:
URL: https://github.com/apache/accumulo/pull/3286#discussion_r1166712180


##########
core/src/main/java/org/apache/accumulo/core/metadata/schema/DataFileValue.java:
##########
@@ -20,42 +20,76 @@
 
 import static java.nio.charset.StandardCharsets.UTF_8;
 
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import java.util.Optional;
+
+import org.apache.accumulo.core.data.Range;
 import org.apache.accumulo.core.data.Value;
 import org.apache.accumulo.core.iteratorsImpl.system.InterruptibleIterator;
 import org.apache.accumulo.core.iteratorsImpl.system.TimeSettingIterator;
+import org.apache.accumulo.core.util.json.RangeAdapter;
+
+import com.google.gson.Gson;
+import com.google.gson.JsonSyntaxException;
 
 public class DataFileValue {
-  private long size;
-  private long numEntries;
+
+  private static final Gson gson = RangeAdapter.createRangeGson();
+
+  private final long size;
+  private final long numEntries;
   private long time = -1;
+  private final List<Range> ranges;

Review Comment:
   I think we should keep using the Range class itself, even if we only care 
about rows and the other parts of the Key are null/not set. The Range class 
has a lot of nice utility methods, such as clip, bound and mergeOverlapping, 
that are going to be quite useful, so it seems like an ideal object to use here.
   
   However, you make a good point about serialization. Since we only care 
about the rows, and you mentioned in another comment that metadata ranges are 
exclusive/inclusive (exclusive start row, inclusive end row), I could do some 
custom serialization. Right now I'm encoding the entire Range object to binary 
and then to Base64, but that includes a lot of extra flags and data to support 
all the fields in a Key, which we could leave out if we don't need them. 
Instead, I could probably just update the Range JSON adapter I created to 
serialize only the row portions of the start/end Key (after encoding each byte 
array to Base64). That should cut down on the amount of data that needs to be 
written and read.
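
   As a rough sketch of the row-only encoding (every name below is hypothetical 
and is not existing Accumulo or Gson API), each row could be Base64-encoded into 
a small holder that a JSON library then serializes field by field. It assumes 
the bounds are always start-exclusive/end-inclusive, so no inclusivity flags are 
stored, and that a null row stands for an unbounded side:

```java
import java.util.Base64;

// Hypothetical sketch: none of these names exist in Accumulo.
final class RowRangeCodec {

  // Plain holder that a JSON library such as Gson could serialize directly.
  // A null field stands for an unbounded (infinite) side of the range.
  static final class EncodedRowRange {
    final String startRow; // Base64 of the start row, assumed exclusive
    final String endRow;   // Base64 of the end row, assumed inclusive

    EncodedRowRange(String startRow, String endRow) {
      this.startRow = startRow;
      this.endRow = endRow;
    }
  }

  // Encode one row to Base64, passing null through for an unbounded side.
  static String encodeRow(byte[] row) {
    return row == null ? null : Base64.getEncoder().encodeToString(row);
  }

  // Reverse of encodeRow.
  static byte[] decodeRow(String encoded) {
    return encoded == null ? null : Base64.getDecoder().decode(encoded);
  }

  static EncodedRowRange encode(byte[] startRow, byte[] endRow) {
    return new EncodedRowRange(encodeRow(startRow), encodeRow(endRow));
  }

  // Hand-rolled JSON for illustration only; the standard Base64 alphabet
  // contains no characters that need escaping inside a JSON string.
  static String toJson(EncodedRowRange r) {
    return "{\"startRow\":" + quote(r.startRow) + ",\"endRow\":" + quote(r.endRow) + "}";
  }

  private static String quote(String s) {
    return s == null ? "null" : "\"" + s + "\"";
  }
}
```

   On the read side, the decoded rows would be turned back into a Range with 
the fixed exclusive/inclusive bounds, so the utility methods stay available.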



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

