geruh commented on code in PR #14702:
URL: https://github.com/apache/iceberg/pull/14702#discussion_r2572924506
##########
core/src/main/java/org/apache/iceberg/ContentFileParser.java:
##########
@@ -301,4 +291,55 @@ private static Metrics metricsFromJson(JsonNode jsonNode) {
lowerBounds,
upperBounds);
}
+
+ private static void partitionToJson(
+ Types.StructType partitionType, StructLike partitionData, JsonGenerator generator)
+ throws IOException {
+ generator.writeStartArray();
+ List<Types.NestedField> fields = partitionType.fields();
+ for (int pos = 0; pos < fields.size(); ++pos) {
+ Types.NestedField field = fields.get(pos);
+ Object partitionValue = partitionData.get(pos, Object.class);
+ SingleValueParser.toJson(field.type(), partitionValue, generator);
+ }
+ generator.writeEndArray();
+ }
+
+ private static PartitionData partitionFromJson(
+ Types.StructType partitionType, JsonNode partitionNode) {
+ List<Types.NestedField> fields = partitionType.fields();
+ PartitionData partitionData = new PartitionData(partitionType);
+
+ if (partitionNode.isArray()) {
+ Preconditions.checkArgument(
+ partitionNode.size() == fields.size(),
+ "Invalid partition data size: expected = %s, actual = %s",
+ fields.size(),
+ partitionNode.size());
+
+ for (int pos = 0; pos < fields.size(); ++pos) {
+ Types.NestedField field = fields.get(pos);
+ Object partitionValue = SingleValueParser.fromJson(field.type(), partitionNode.get(pos));
+ partitionData.set(pos, partitionValue);
+ }
+ } else if (partitionNode.isObject()) {
+ // Handle legacy partition object serialization
+ for (int pos = 0; pos < fields.size(); ++pos) {
+ Types.NestedField field = fields.get(pos);
+ String fieldId = String.valueOf(field.fieldId());
+ if (partitionNode.has(fieldId)) {
+ Object partitionValue =
+ SingleValueParser.fromJson(field.type(), partitionNode.get(fieldId));
+ partitionData.set(pos, partitionValue);
+ }
+ }
Review Comment:
Yeah, I agree. I thought this was a bit more defensive, but the original
cleans things up. As for the precondition: the StructLike tracks the schema
size, so that check will pretty much always hold. Furthermore, the parsing is
sparse, meaning nulls are omitted, so I think it would be safer to make it a
`<=` check.
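To illustrate the point above, here is a hypothetical, self-contained sketch (not the actual patch): `PartitionSizeCheck`, `validateSize`, and the inlined `checkArgument` are stand-ins for the real `ContentFileParser` precondition and Guava's `Preconditions.checkArgument`. With sparse legacy serialization, null values are omitted from the JSON, so the parsed node can legitimately hold fewer entries than the partition type has fields; a `<=` check accepts that while still rejecting extra entries.

```java
// Hypothetical sketch of the suggested <= precondition; names are
// illustrative, not the actual Iceberg code.
public class PartitionSizeCheck {

    // Stand-in for Guava's Preconditions.checkArgument
    static void checkArgument(boolean condition, String message, Object... args) {
        if (!condition) {
            throw new IllegalArgumentException(String.format(message, args));
        }
    }

    // nodeSize: entries parsed from the JSON node;
    // fieldCount: fields in the partition type
    static void validateSize(int nodeSize, int fieldCount) {
        checkArgument(
                nodeSize <= fieldCount,
                "Invalid partition data size: expected at most %s, actual = %s",
                fieldCount,
                nodeSize);
    }

    public static void main(String[] args) {
        validateSize(2, 3); // sparse object: one null omitted, accepted
        validateSize(3, 3); // full object, accepted
        try {
            validateSize(4, 3); // more entries than fields: rejected
            throw new AssertionError("expected rejection");
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected oversize partition as expected");
        }
    }
}
```

With a strict `==` check, the first call above would throw on any legacy object that omitted a null value, even though the data is valid.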
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]