kbendick commented on a change in pull request #1430:
URL: https://github.com/apache/iceberg/pull/1430#discussion_r487588916
##########
File path: api/src/main/java/org/apache/iceberg/Metrics.java
##########
@@ -120,4 +124,90 @@ public Long recordCount() {
public Map<Integer, ByteBuffer> upperBounds() {
return upperBounds;
}
+
+ /**
+ * Implemented the method to enable serialization of ByteBuffers.
+ * @param out The stream where to write
+ * @throws IOException On serialization error
+ */
+ private void writeObject(ObjectOutputStream out) throws IOException {
+ out.writeObject(rowCount);
+ out.writeObject(columnSizes);
+ out.writeObject(valueCounts);
+ out.writeObject(nullValueCounts);
+
+ writeByteBufferMap(out, lowerBounds);
+ writeByteBufferMap(out, upperBounds);
+ }
Review comment:
Oh, actually, I spoke to soon. There is documentation on the
serialization of types when storing them as lower bounds / upper bounds in the
maps of manifest files, namely in the `data_files` struct:
https://iceberg.apache.org/spec/#manifests
There's also documentation on what single values can be stored in the upper
bounds / lower bounds byte buffer maps and how they are serialized:
https://iceberg.apache.org/spec/#appendix-d-single-value-serialization
Perhaps something from that documentation page can be used to make this
serialization choice inline with the existing stuff? Or at the least I would
recommend that documentation be added, though I think that you might want to
consider submitting this PR to the dev mailing list at [email protected],
as I feel that likely some people will have an opinion about the introduction
of an additional standard for serialization (especially given that it's part of
the public facing API). You can see the dev mailing list archive here:
https://www.mail-archive.com/[email protected]/
##########
File path: api/src/main/java/org/apache/iceberg/Metrics.java
##########
@@ -120,4 +124,90 @@ public Long recordCount() {
public Map<Integer, ByteBuffer> upperBounds() {
return upperBounds;
}
+
+ /**
+ * Implemented the method to enable serialization of ByteBuffers.
+ * @param out The stream where to write
+ * @throws IOException On serialization error
+ */
+ private void writeObject(ObjectOutputStream out) throws IOException {
+ out.writeObject(rowCount);
+ out.writeObject(columnSizes);
+ out.writeObject(valueCounts);
+ out.writeObject(nullValueCounts);
+
+ writeByteBufferMap(out, lowerBounds);
+ writeByteBufferMap(out, upperBounds);
+ }
Review comment:
Oh, actually, I spoke too soon. There is documentation on the
serialization of types when storing them as lower bounds / upper bounds in the
maps of manifest files, namely in the `data_files` struct:
https://iceberg.apache.org/spec/#manifests
There's also documentation on what single values can be stored in the upper
bounds / lower bounds byte buffer maps and how they are serialized:
https://iceberg.apache.org/spec/#appendix-d-single-value-serialization
Perhaps something from that documentation page can be used to make this
serialization choice inline with the existing stuff? Or at the least I would
recommend that documentation be added, though I think that you might want to
consider submitting this PR to the dev mailing list at [email protected],
as I feel that likely some people will have an opinion about the introduction
of an additional standard for serialization (especially given that it's part of
the public facing API). You can see the dev mailing list archive here:
https://www.mail-archive.com/[email protected]/
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]