zhoufek commented on code in PR #17477:
URL: https://github.com/apache/beam/pull/17477#discussion_r861244041
##########
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/StaticSchemaInference.java:
##########
@@ -79,30 +80,59 @@ enum MethodType {
.put(BigDecimal.class, FieldType.DECIMAL)
.build();
+ /** Helper method that instantiates HashMap to verify that schemas don't
reference themselves. */
Review Comment:
The comment on the public method should remain the same. Users want to know
what this offers them, not how it is implemented, which is what this comment is
about. Even if most of the implementation is in the private method, it is the
helper to this one.
##########
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/StaticSchemaInference.java:
##########
@@ -79,30 +80,59 @@ enum MethodType {
.put(BigDecimal.class, FieldType.DECIMAL)
.build();
+ /** Helper method that instantiates HashMap to verify that schemas don't
reference themselves. */
+ public static Schema schemaFromClass(
+ Class<?> clazz, FieldValueTypeSupplier fieldValueTypeSupplier) {
+ return schemaFromClass(clazz, fieldValueTypeSupplier, new HashMap<Class,
Schema>());
+ }
+
/**
* Infer a schema from a Java class.
*
* <p>Takes in a function to extract a list of field types from a class.
Different callers may
* have different strategies for extracting this list: e.g. introspecting
public member variables,
* public getter methods, or special annotations on the class.
*/
- public static Schema schemaFromClass(
- Class<?> clazz, FieldValueTypeSupplier fieldValueTypeSupplier) {
+ private static Schema schemaFromClass(
+ Class<?> clazz,
+ FieldValueTypeSupplier fieldValueTypeSupplier,
+ HashMap<Class, Schema> alreadyVisitedSchemas) {
+ if (alreadyVisitedSchemas.containsKey(clazz)) {
+ Schema existingSchema = alreadyVisitedSchemas.get(clazz);
+ if (existingSchema == null) {
+ throw new IllegalArgumentException(
+ "Cannot infer schema with a circular reference. Class: " +
clazz.getTypeName());
+ }
+ return existingSchema;
+ }
+ alreadyVisitedSchemas.put(clazz, null);
Schema.Builder builder = Schema.builder();
for (FieldValueTypeInformation type : fieldValueTypeSupplier.get(clazz)) {
- Schema.FieldType fieldType = fieldFromType(type.getType(),
fieldValueTypeSupplier);
+ Schema.FieldType fieldType =
+ fieldFromType(type.getType(), fieldValueTypeSupplier,
alreadyVisitedSchemas);
if (type.isNullable()) {
builder.addNullableField(type.getName(), fieldType);
} else {
builder.addField(type.getName(), fieldType);
}
}
- return builder.build();
+ alreadyVisitedSchemas.remove(clazz);
+ Schema generatedSchema = builder.build();
+ alreadyVisitedSchemas.put(clazz, generatedSchema);
+ return generatedSchema;
}
- /** Map a Java field type to a Beam Schema FieldType. */
+ /** Helper method that instantiates HashMap to verify that schemas don't
reference themselves. */
Review Comment:
Similar to the above. It's the private method that helps this one. This
should keep the same doc comment as before.
##########
sdks/java/extensions/protobuf/src/main/java/org/apache/beam/sdk/extensions/protobuf/ProtoSchemaTranslator.java:
##########
@@ -156,6 +165,16 @@ static Schema getSchema(Class<? extends Message> clazz) {
}
static Schema getSchema(Descriptors.Descriptor descriptor) {
Review Comment:
Sorry, missed this: The method is now no longer thread-safe due to the use
of a static hash map. I think that's acceptable for a package-private method,
but this should be added to the Javadoc in case anyone in the future tries to
use this in a multithreaded context.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]