haizhou-zhao commented on a change in pull request #4120:
URL: https://github.com/apache/iceberg/pull/4120#discussion_r806294592



##########
File path: core/src/main/java/org/apache/iceberg/avro/AvroSchemaUtil.java
##########
@@ -90,6 +90,10 @@ static boolean hasIds(Schema schema) {
     return AvroCustomOrderSchemaVisitor.visit(schema, new HasIds());
   }
 
+  static boolean missingIds(Schema schema) {

Review comment:
       Added.

##########
File path: core/src/main/java/org/apache/iceberg/avro/BuildAvroProjection.java
##########
@@ -49,6 +49,11 @@
     this.current = expectedSchema.asStruct();
   }
 
+  BuildAvroProjection(org.apache.iceberg.types.Type expectedType, Map<String, 
String> renames) {

Review comment:
       This is for testing only.

##########
File path: core/src/main/java/org/apache/iceberg/avro/AvroSchemaUtil.java
##########
@@ -357,6 +361,26 @@ static Schema copyRecord(Schema record, List<Schema.Field> 
newFields, String new
     return copy;
   }
 
+  static Schema copyArray(Schema array, Schema elementSchema) {
+    Preconditions.checkArgument(array.getType() == 
org.apache.avro.Schema.Type.ARRAY,
+        "Cannot invoke copyArray on non array schema");
+    Schema copy = Schema.createArray(elementSchema);
+    for (Map.Entry<String, Object> prop : array.getObjectProps().entrySet()) {
+      copy.addProp(prop.getKey(), prop.getValue());
+    }
+    return copy;
+  }
+
+  static Schema copyMap(Schema map, Schema valueSchema) {
+    Preconditions.checkArgument(map.getType() == 
org.apache.avro.Schema.Type.MAP,
+        "Cannot invoke copyMap on non map schema");

Review comment:
       Added

##########
File path: core/src/main/java/org/apache/iceberg/avro/AvroSchemaUtil.java
##########
@@ -357,6 +361,26 @@ static Schema copyRecord(Schema record, List<Schema.Field> 
newFields, String new
     return copy;
   }
 
+  static Schema copyArray(Schema array, Schema elementSchema) {

Review comment:
       Adopted `replaceElement` and `replaceValue`

##########
File path: core/src/main/java/org/apache/iceberg/avro/AvroSchemaUtil.java
##########
@@ -90,6 +90,23 @@ static boolean hasIds(Schema schema) {
     return AvroCustomOrderSchemaVisitor.visit(schema, new HasIds());
   }
 
+  /**
+   * @param schema an Avro Schema
+   * @return true/false based on whether any of the nodes in the provided 
schema is missing an
+   * ID property recognizable by Iceberg core API. To have an ID recognizable 
by Iceberg core API:
+   * <ul>
+   *   <li>a field node under struct (record) schema should have {@link 
FIELD_ID_PROP} property
+   *   <li>an element node under list (array) schema should have {@link 
ELEMENT_ID_PROP} property
+   *   <li>a pair of key and value node under map schema should have {@link 
KEY_ID_PROP} and
+   *   {@link VALUE_ID_PROP} respectively
+   *   <li>a primitive node is not assigned any ID related properties
+   * </ul>

Review comment:
       Moved out of return section and before `param` or `return` in a general 
description.

##########
File path: core/src/main/java/org/apache/iceberg/avro/AvroSchemaUtil.java
##########
@@ -90,6 +90,23 @@ static boolean hasIds(Schema schema) {
     return AvroCustomOrderSchemaVisitor.visit(schema, new HasIds());
   }
 
+  /**
+   * @param schema an Avro Schema
+   * @return true/false based on whether any of the nodes in the provided 
schema is missing an
+   * ID property recognizable by Iceberg core API. To have an ID recognizable 
by Iceberg core API:
+   * <ul>
+   *   <li>a field node under struct (record) schema should have {@link 
FIELD_ID_PROP} property
+   *   <li>an element node under list (array) schema should have {@link 
ELEMENT_ID_PROP} property
+   *   <li>a pair of key and value node under map schema should have {@link 
KEY_ID_PROP} and
+   *   {@link VALUE_ID_PROP} respectively
+   *   <li>a primitive node is not assigned any ID related properties
+   * </ul>
+   * @implNote see {@link MissingIds} for more details

Review comment:
       Make sense, removed.

##########
File path: core/src/main/java/org/apache/iceberg/avro/AvroSchemaUtil.java
##########
@@ -90,6 +90,23 @@ static boolean hasIds(Schema schema) {
     return AvroCustomOrderSchemaVisitor.visit(schema, new HasIds());
   }
 
+  /**
+   * @param schema an Avro Schema

Review comment:
       I tried to follow some example I saw under the 
[Catalog.class](https://github.com/apache/iceberg/blob/master/api/src/main/java/org/apache/iceberg/catalog/Catalog.java#L57),
 basically, a general description, then followed by `param`, `return`, 
`throws`. Let me know if I'm missing some components in this java doc.

##########
File path: core/src/main/java/org/apache/iceberg/avro/AvroSchemaUtil.java
##########
@@ -90,6 +90,23 @@ static boolean hasIds(Schema schema) {
     return AvroCustomOrderSchemaVisitor.visit(schema, new HasIds());
   }
 
+  /**
+   * @param schema an Avro Schema

Review comment:
       I tried to follow some examples I saw under the 
[Catalog.class](https://github.com/apache/iceberg/blob/master/api/src/main/java/org/apache/iceberg/catalog/Catalog.java#L57),
 basically, a general description, then followed by `param`, `return`, 
`throws`. Let me know if I'm missing some components in this java doc.

##########
File path: core/src/main/java/org/apache/iceberg/avro/AvroSchemaUtil.java
##########
@@ -90,6 +90,23 @@ static boolean hasIds(Schema schema) {
     return AvroCustomOrderSchemaVisitor.visit(schema, new HasIds());
   }
 
+  /**
+   * @param schema an Avro Schema
+   * @return true/false based on whether any of the nodes in the provided 
schema is missing an
+   * ID property recognizable by Iceberg core API. To have an ID recognizable 
by Iceberg core API:
+   * <ul>
+   *   <li>a field node under struct (record) schema should have {@link 
FIELD_ID_PROP} property
+   *   <li>an element node under list (array) schema should have {@link 
ELEMENT_ID_PROP} property
+   *   <li>a pair of key and value node under map schema should have {@link 
KEY_ID_PROP} and
+   *   {@link VALUE_ID_PROP} respectively
+   *   <li>a primitive node is not assigned any ID related properties
+   * </ul>
+   * @implNote see {@link MissingIds} for more details

Review comment:
       removed `public`

##########
File path: core/src/main/java/org/apache/iceberg/avro/AvroSchemaUtil.java
##########
@@ -90,6 +90,25 @@ static boolean hasIds(Schema schema) {
     return AvroCustomOrderSchemaVisitor.visit(schema, new HasIds());
   }
 
+  /**
+   * Check if any of the nodes in a given avro schema is missing an ID 
recognizable by Iceberg core API
+   *

Review comment:
       Added

##########
File path: core/src/main/java/org/apache/iceberg/avro/AvroSchemaUtil.java
##########
@@ -90,6 +90,25 @@ static boolean hasIds(Schema schema) {
     return AvroCustomOrderSchemaVisitor.visit(schema, new HasIds());
   }
 
+  /**
+   * Check if any of the nodes in a given avro schema is missing an ID 
recognizable by Iceberg core API

Review comment:
       Make sense. Changed.

##########
File path: core/src/main/java/org/apache/iceberg/avro/MissingIds.java
##########
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.avro;
+
+import java.util.List;
+import java.util.function.Supplier;
+import org.apache.avro.Schema;
+
+/**
+ * Returns true once the first node is found with ID property missing. Reverse 
of {@link HasIds}
+ * <p>
+ * Note: To use {@link AvroSchemaUtil#toIceberg(Schema)} on an avro schema, 
the avro schema need to be either
+ * have IDs on every node or not have IDs at all. Invoke {@link 
AvroSchemaUtil#hasIds(Schema)} only proves
+ * that the schema has at least one ID, and not sufficient condition for 
invoking
+ * {@link AvroSchemaUtil#toIceberg(Schema)} on the schema.
+ */
+public class MissingIds extends AvroCustomOrderSchemaVisitor<Boolean, Boolean> 
{

Review comment:
       Removed `public`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to