[
https://issues.apache.org/jira/browse/BEAM-13294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454348#comment-17454348
]
Andrew Pilloud commented on BEAM-13294:
---------------------------------------
The issue appears to occur here:
https://github.com/apache/beam/blob/243128a8fc52798e1b58b0cf1a271d95ee7aa241/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/CoGroup.java#L520
These involved schemas should all match, they don't:
Lookup key schema: STRING NOT NULL
Map key schema: STRING
Map value schema: STRING
Testing further, the "Map Key Schema" is nullable if either of the other two
are. This appears to be a widening bug.
> innerBroadcastJoin fails when batch column Nullable, streaming column not
> -------------------------------------------------------------------------
>
> Key: BEAM-13294
> URL: https://issues.apache.org/jira/browse/BEAM-13294
> Project: Beam
> Issue Type: Bug
> Components: dsl-sql, dsl-sql-zetasql, sdk-java-core
> Reporter: Andrew Pilloud
> Assignee: Andrew Pilloud
> Priority: P1
>
> The join always fail when given a batch column that is Nullable and a
> streaming column that is Not Nullable. This was discovered in
> BeamSideInputJoinRel but is likely an issue in the Beam core join library.
> Trivial reproduction with a small test diff:
> ./gradlew :runners:direct-java:needsRunnerTests --tests
> org.apache.beam.sdk.schemas.transforms.JoinTest.testInnerJoinDifferentKeys
> {code}
> ---
> a/sdks/java/core/src/test/java/org/apache/beam/sdk/schemas/transforms/JoinTest.java
> +++
> b/sdks/java/core/src/test/java/org/apache/beam/sdk/schemas/transforms/JoinTest.java
> @@ -51,7 +51,7 @@ public class JoinTest {
> .build();
> private static final Schema CG_SCHEMA_2 =
> Schema.builder()
> - .addStringField("user2")
> + .addNullableField("user2", Schema.FieldType.STRING)
> .addInt32Field("count2")
> .addStringField("country2")
> .build();
> {code}
> At a lower level:
> ./gradlew :runners:direct-java:needsRunnerTests --tests
> org.apache.beam.sdk.schemas.transforms.CoGroupTest.testCoGroupByDifferentFields
> {code}
> ---
> a/sdks/java/core/src/test/java/org/apache/beam/sdk/schemas/transforms/CoGroupTest.java
> +++
> b/sdks/java/core/src/test/java/org/apache/beam/sdk/schemas/transforms/CoGroupTest.java
> @@ -237,7 +237,7 @@ public class CoGroupTest {
>
> private static final Schema CG_SCHEMA_2 =
> Schema.builder()
> - .addStringField("user2")
> + .addNullableField("user2", Schema.FieldType.STRING)
> .addInt32Field("count2")
> .addStringField("country2")
> .build();
> {code}
--
This message was sent by Atlassian Jira
(v8.20.1#820001)