zhoufek commented on code in PR #17477:
URL: https://github.com/apache/beam/pull/17477#discussion_r860864600
##########
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/StaticSchemaInference.java:
##########
@@ -87,22 +88,33 @@ enum MethodType {
* public getter methods, or special annotations on the class.
*/
public static Schema schemaFromClass(
- Class<?> clazz, FieldValueTypeSupplier fieldValueTypeSupplier) {
+ Class<?> clazz,
+ FieldValueTypeSupplier fieldValueTypeSupplier,
+ HashSet<Class> alreadyVisitedSchemas) {
Review Comment:
I'm concerned about modifying a public method, since it will break existing
users. If we plan on breaking something, we want to first deprecate it and then
give users at least a couple versions to change. I don't think that's practical
here. You could make a private method, move the implementation to that method,
and use the public method to just create the `Set` and pass it into the private
method.
The same goes for `fieldFromType`.
##########
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/StaticSchemaInference.java:
##########
@@ -87,22 +88,33 @@ enum MethodType {
* public getter methods, or special annotations on the class.
*/
public static Schema schemaFromClass(
- Class<?> clazz, FieldValueTypeSupplier fieldValueTypeSupplier) {
+ Class<?> clazz,
+ FieldValueTypeSupplier fieldValueTypeSupplier,
+ HashSet<Class> alreadyVisitedSchemas) {
+ if (alreadyVisitedSchemas.contains(clazz)) {
+ throw new RuntimeException(
Review Comment:
Prefer the more specific `IllegalArgumentException`. `RuntimeException`
doesn't give us much info as to what is wrong and makes writing good
`try/catch` blocks harder.
##########
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/StaticSchemaInference.java:
##########
@@ -87,22 +88,33 @@ enum MethodType {
* public getter methods, or special annotations on the class.
*/
public static Schema schemaFromClass(
- Class<?> clazz, FieldValueTypeSupplier fieldValueTypeSupplier) {
+ Class<?> clazz,
+ FieldValueTypeSupplier fieldValueTypeSupplier,
+ HashSet<Class> alreadyVisitedSchemas) {
+ if (alreadyVisitedSchemas.contains(clazz)) {
+ throw new RuntimeException(
+ "Cannot infer schema with a circular reference. Class: " +
clazz.getTypeName());
+ }
+ alreadyVisitedSchemas.add(clazz);
Schema.Builder builder = Schema.builder();
for (FieldValueTypeInformation type : fieldValueTypeSupplier.get(clazz)) {
- Schema.FieldType fieldType = fieldFromType(type.getType(),
fieldValueTypeSupplier);
+ Schema.FieldType fieldType =
+ fieldFromType(type.getType(), fieldValueTypeSupplier,
alreadyVisitedSchemas);
if (type.isNullable()) {
builder.addNullableField(type.getName(), fieldType);
} else {
builder.addField(type.getName(), fieldType);
}
}
+ alreadyVisitedSchemas.remove(clazz);
Review Comment:
Looking at this, I'm wondering if we should instead track the type through a
`Map<Class, Schema>` instead of `Set<Class>`. It might be possible that
multiple types nested under the same type reference each other in a
non-circular way. We might be able to avoid repeating work if we keep a
reference to previously computed schemas. Basically, it would look like:
```
if (visited.containsKey(clazz) && visited.get(clazz) == null) {
throw new IllegalArgumentException(...);
}
visited.put(clazz, null);
// Compute
visited.replace(clazz, schema);
```
Note: I haven't tested this out. I'm just throwing out an idea.
##########
sdks/java/core/src/test/java/org/apache/beam/sdk/schemas/JavaFieldSchemaTest.java:
##########
@@ -728,4 +730,46 @@ public void testNoCreateOptionThrows() throws
NoSuchSchemaException {
thrown.getMessage(),
containsString("zero-argument constructor"));
}
+
+ @Test
+ public void testSelfNestedPOJOThrows() throws NoSuchSchemaException {
+ SchemaRegistry registry = SchemaRegistry.createDefault();
Review Comment:
nit: Add a space between creating the registry and getting the exception.
Unit tests should follow the Arrange/Act/Assert design pattern
(http://wiki.c2.com/?ArrangeActAssert), and we generally use spaces to separate
the sections. You can see this in other tests in this file.
##########
sdks/java/core/src/test/java/org/apache/beam/sdk/schemas/JavaFieldSchemaTest.java:
##########
@@ -728,4 +730,46 @@ public void testNoCreateOptionThrows() throws
NoSuchSchemaException {
thrown.getMessage(),
containsString("zero-argument constructor"));
}
+
+ @Test
+ public void testSelfNestedPOJOThrows() throws NoSuchSchemaException {
+ SchemaRegistry registry = SchemaRegistry.createDefault();
+ RuntimeException thrown =
+ assertThrows(
+ RuntimeException.class,
+ () -> {
+ registry.getSchema(SelfNestedPOJO.class);
+ });
+
+ assertThat(
+ "Message should suggest not using a circular schema reference.",
+ thrown.getMessage(),
+ containsString("circular reference"));
+
Review Comment:
nit: Remove space. Similar to above, it keeps the Assert section together
and is more consistent with other tests.
This and the above comment should apply to other changes.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]