openinx commented on a change in pull request #2465:
URL: https://github.com/apache/iceberg/pull/2465#discussion_r612209228
##########
File path: core/src/main/java/org/apache/iceberg/util/JsonUtil.java
##########
@@ -116,19 +122,42 @@ public static String getStringOrNull(String property,
JsonNode node) {
}
public static List<String> getStringList(String property, JsonNode node) {
- Preconditions.checkArgument(node.has(property), "Cannot parse missing list
%s", property);
+ ImmutableList.Builder<String> builder = ImmutableList.builder();
+ return getCollection(property, node, builder::build, builder::add,
JsonNode::asText,
+ e -> Preconditions.checkArgument(e.isTextual(), "Cannot parse string
from non-text value: %s", e),
+ false);
+ }
+
+ public static Set<Integer> getIntegerSetOrNull(String property, JsonNode
node) {
+ ImmutableSet.Builder<Integer> builder = ImmutableSet.builder();
+ return getCollection(property, node, builder::build, builder::add,
JsonNode::asInt,
+ e -> Preconditions.checkArgument(e.isInt(), "Cannot parse string from
non-integer value: %s", e),
+ true);
+ }
+
+ private static <C extends Collection<T>, T> C getCollection(
+ String property, JsonNode node,
+ Supplier<C> builder, Consumer<T> appender,
+ Function<JsonNode, T> getter, Consumer<JsonNode> validator,
Review comment:
Those `builder`, `appender`, `getter` and `validator` makes the method
to be really hard to follow, could we just defines a simple `visitor` interface
to visit the `JsonNode` in this collection and collect all the final result to
the upper layer method ?
##########
File path: core/src/main/java/org/apache/iceberg/SchemaUpdate.java
##########
@@ -408,11 +450,15 @@ private TableMetadata applyChangesToMapping(TableMetadata
metadata) {
private static Schema applyChanges(Schema schema, List<Integer> deletes,
Map<Integer, Types.NestedField> updates,
Multimap<Integer, Types.NestedField> adds,
- Multimap<Integer, Move> moves) {
+ Multimap<Integer, Move> moves,
+ Set<String> rowIdentifierNames) {
Types.StructType struct = TypeUtil
.visit(schema, new ApplyChanges(deletes, updates, adds, moves))
.asNestedType().asStructType();
- return new Schema(struct.fields());
+ Set<Integer> refreshedRowIdentifiers = rowIdentifierNames.stream()
+ .map(name -> struct.field(name).fieldId())
Review comment:
If we delete column `b` by using the `SchemaUpdate#deleteColumn` API,
then the `struct` variable which has been applied by those changes won't
include the field that has the colum name `b`, right ? If sure, then how
could we locate the field id for that given column name `b` in this sentence ?
##########
File path: core/src/main/java/org/apache/iceberg/SchemaParser.java
##########
@@ -158,7 +169,7 @@ public static String toJson(Schema schema, boolean pretty) {
if (pretty) {
generator.useDefaultPrettyPrinter();
}
- toJson(schema.asStruct(), generator);
+ toJson(schema.asStruct(), schema.schemaId(), schema.rowIdentifiers(),
generator);
Review comment:
I think this
[PR](https://github.com/apache/iceberg/pull/2096/files#diff-0fd54fb65a9c5d2fce5bd3386550b2e829e0f70c2992fdad8fdb9bd2d866a74fR161)
forget to refresh this `toJson` method.
##########
File path: api/src/main/java/org/apache/iceberg/Schema.java
##########
@@ -355,9 +387,17 @@ private Schema internalSelect(Collection<String> names,
boolean caseSensitive) {
@Override
public String toString() {
- return String.format("table {\n%s\n}",
- NEWLINE.join(struct.fields().stream()
- .map(f -> " " + f)
- .collect(Collectors.toList())));
+ StringBuilder sb = new StringBuilder();
+ sb.append("table {\n");
+ sb.append(" fields {\n");
+ sb.append(NEWLINE.join(struct.fields().stream()
+ .map(f -> " " + f)
+ .collect(Collectors.toList())));
+ sb.append("\n }\n row identifiers { ");
+ sb.append(COMMA.join(Arrays.stream(rowIdentifiers)
+ .mapToObj(id -> String.format("%d", id))
+ .collect(Collectors.toList())));
Review comment:
How about just simplify this as
`sb.append(COMMA.join(lazyRowIdentifiersSet()))` ?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]