zhoufek commented on code in PR #17477:
URL: https://github.com/apache/beam/pull/17477#discussion_r860920916


##########
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/StaticSchemaInference.java:
##########
@@ -87,22 +88,33 @@ enum MethodType {
    * public getter methods, or special annotations on the class.
    */
   public static Schema schemaFromClass(
-      Class<?> clazz, FieldValueTypeSupplier fieldValueTypeSupplier) {
+      Class<?> clazz,
+      FieldValueTypeSupplier fieldValueTypeSupplier,
+      HashSet<Class> alreadyVisitedSchemas) {
+    if (alreadyVisitedSchemas.contains(clazz)) {
+      throw new RuntimeException(
+          "Cannot infer schema with a circular reference. Class: " + 
clazz.getTypeName());
+    }
+    alreadyVisitedSchemas.add(clazz);
     Schema.Builder builder = Schema.builder();
     for (FieldValueTypeInformation type : fieldValueTypeSupplier.get(clazz)) {
-      Schema.FieldType fieldType = fieldFromType(type.getType(), 
fieldValueTypeSupplier);
+      Schema.FieldType fieldType =
+          fieldFromType(type.getType(), fieldValueTypeSupplier, 
alreadyVisitedSchemas);
       if (type.isNullable()) {
         builder.addNullableField(type.getName(), fieldType);
       } else {
         builder.addField(type.getName(), fieldType);
       }
     }
+    alreadyVisitedSchemas.remove(clazz);

Review Comment:
   Looking at this, I'm wondering if we should instead track the type through a 
`Map<Class, Schema>` instead of `Set<Class>`. It might be possible that 
multiple types nested under the same type reference each other in a 
non-circular way. We might be able to avoid repeating work if we keep a 
reference to previously computed schemas. Basically, it would look like:
   
   ```
   if (visited.containsKey(clazz)) {
     Schema value = visited.get(clazz);
     if (value == null) {
       throw new IllegalArgumentException(...);
     }
     return value;
   }
   visited.put(clazz, null);
   // Compute
   visited.replace(clazz, schema);
   ```
   
   Note: I haven't tested this out. I'm just throwing out an idea.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to