[
https://issues.apache.org/jira/browse/BEAM-9144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17019629#comment-17019629
]
Aaron Dixon edited comment on BEAM-9144 at 1/20/20 4:34 PM:
------------------------------------------------------------
Attempting to validate with 2.20.0-SNAPSHOT [1]
My pipeline however is unable to start in Dataflow, I get:
{noformat}
java.lang.ClassNotFoundException:
org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.Message$Builder
{noformat}
When I inspect my build artifacts I find that when I depend on Beam 2.16.0
(last Beam build that works for me), I find this class present (v1p21p0). When
I build against Beam 2.20.0-SNAPSHOT, only v1p26p0 is resolved and packaged.
* I assume that the Dataflow runner needs to somehow know to look for v1p26p0
somehow? I.e., do I need to send it a flag that I'm suing 2.20.0-SNAPSHOT so
that it can know to try to resolve the later versions of the rpc message (?)
classes?
* --or-- Is this a bug in 2.20.0-SNAPSHOT?
===
[1] Complete list of 2.20.0-SNAPSHOT dependencies resolved by my build:
{noformat}
.../beam-sdks-java-io-kafka/2.20.0-SNAPSHOT/beam-sdks-java-io-kafka-2.20.0-20200120.072626-5.pom
.../beam-sdks-java-core/2.20.0-SNAPSHOT/beam-sdks-java-core-2.20.0-20200120.071349-5.pom
.../beam-model-pipeline/2.20.0-SNAPSHOT/beam-model-pipeline-2.20.0-20200120.070348-5.pom
.../beam-model-job-management/2.20.0-SNAPSHOT/beam-model-job-management-2.20.0-20200120.070335-5.pom
.../beam-runners-google-cloud-dataflow-java/2.20.0-SNAPSHOT/beam-runners-google-cloud-dataflow-java-2.20.0-20200120.070606-5.pom
.../beam-sdks-java-extensions-google-cloud-platform-core/2.20.0-SNAPSHOT/beam-sdks-java-extensions-google-cloud-platform-core-2.20.0-20200120.072059-5.pom
.../beam-sdks-java-io-google-cloud-platform/2.20.0-SNAPSHOT/beam-sdks-java-io-google-cloud-platform-2.20.0-20200120.072441-5.pom
.../beam-sdks-java-extensions-protobuf/2.20.0-SNAPSHOT/beam-sdks-java-extensions-protobuf-2.20.0-20200120.072149-5.pom
.../beam-runners-core-construction-java/2.20.0-SNAPSHOT/beam-runners-core-construction-java-2.20.0-20200120.070440-5.pom
.../beam-runners-direct-java/2.20.0-SNAPSHOT/beam-runners-direct-java-2.20.0-20200120.070532-5.pom
.../beam-sdks-java-core/2.20.0-SNAPSHOT/beam-sdks-java-core-2.20.0-20200120.071349-5.jar
.../beam-model-job-management/2.20.0-SNAPSHOT/beam-model-job-management-2.20.0-20200120.070335-5.jar
.../beam-model-pipeline/2.20.0-SNAPSHOT/beam-model-pipeline-2.20.0-20200120.070348-5.jar
.../beam-sdks-java-io-kafka/2.20.0-SNAPSHOT/beam-sdks-java-io-kafka-2.20.0-20200120.072626-5.jar
.../beam-runners-google-cloud-dataflow-java/2.20.0-SNAPSHOT/beam-runners-google-cloud-dataflow-java-2.20.0-20200120.070606-5.jar
.../beam-sdks-java-extensions-google-cloud-platform-core/2.20.0-SNAPSHOT/beam-sdks-java-extensions-google-cloud-platform-core-2.20.0-20200120.072059-5.jar
.../beam-sdks-java-io-google-cloud-platform/2.20.0-SNAPSHOT/beam-sdks-java-io-google-cloud-platform-2.20.0-20200120.072441-5.jar
.../beam-sdks-java-extensions-protobuf/2.20.0-SNAPSHOT/beam-sdks-java-extensions-protobuf-2.20.0-20200120.072149-5.jar
.../beam-runners-core-construction-java/2.20.0-SNAPSHOT/beam-runners-core-construction-java-2.20.0-20200120.070440-5.jar
.../beam-runners-direct-java/2.20.0-SNAPSHOT/beam-runners-direct-java-2.20.0-20200120.070532-5.jar
{noformat}
was (Author: atdixon):
Attempting to validate with 2.20.0-SNAPSHOT [1]
My pipeline however is unable to start in Dataflow, I get:
`java.lang.ClassNotFoundException:
org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.Message$Builder`
When I inspect my build artifacts I find that when I depend on Beam 2.16.0
(last Beam build that works for me), I find this class present (v1p21p0). When
I build against Beam 2.20.0-SNAPSHOT, only v1p26p0 is resolved and packaged.
* I assume that the Dataflow runner needs to somehow know to look for v1p26p0
somehow? I.e., do I need to send it a flag that I'm suing 2.20.0-SNAPSHOT so
that it can know to try to resolve the later versions of the rpc message (?)
classes?
* --or-- Is this a bug in 2.20.0-SNAPSHOT?
===
[1] Complete list of 2.20.0-SNAPSHOT dependencies resolved by my build:
```
.../beam-sdks-java-io-kafka/2.20.0-SNAPSHOT/beam-sdks-java-io-kafka-2.20.0-20200120.072626-5.pom
.../beam-sdks-java-core/2.20.0-SNAPSHOT/beam-sdks-java-core-2.20.0-20200120.071349-5.pom
.../beam-model-pipeline/2.20.0-SNAPSHOT/beam-model-pipeline-2.20.0-20200120.070348-5.pom
.../beam-model-job-management/2.20.0-SNAPSHOT/beam-model-job-management-2.20.0-20200120.070335-5.pom
.../beam-runners-google-cloud-dataflow-java/2.20.0-SNAPSHOT/beam-runners-google-cloud-dataflow-java-2.20.0-20200120.070606-5.pom
.../beam-sdks-java-extensions-google-cloud-platform-core/2.20.0-SNAPSHOT/beam-sdks-java-extensions-google-cloud-platform-core-2.20.0-20200120.072059-5.pom
.../beam-sdks-java-io-google-cloud-platform/2.20.0-SNAPSHOT/beam-sdks-java-io-google-cloud-platform-2.20.0-20200120.072441-5.pom
.../beam-sdks-java-extensions-protobuf/2.20.0-SNAPSHOT/beam-sdks-java-extensions-protobuf-2.20.0-20200120.072149-5.pom
.../beam-runners-core-construction-java/2.20.0-SNAPSHOT/beam-runners-core-construction-java-2.20.0-20200120.070440-5.pom
.../beam-runners-direct-java/2.20.0-SNAPSHOT/beam-runners-direct-java-2.20.0-20200120.070532-5.pom
.../beam-sdks-java-core/2.20.0-SNAPSHOT/beam-sdks-java-core-2.20.0-20200120.071349-5.jar
.../beam-model-job-management/2.20.0-SNAPSHOT/beam-model-job-management-2.20.0-20200120.070335-5.jar
.../beam-model-pipeline/2.20.0-SNAPSHOT/beam-model-pipeline-2.20.0-20200120.070348-5.jar
.../beam-sdks-java-io-kafka/2.20.0-SNAPSHOT/beam-sdks-java-io-kafka-2.20.0-20200120.072626-5.jar
.../beam-runners-google-cloud-dataflow-java/2.20.0-SNAPSHOT/beam-runners-google-cloud-dataflow-java-2.20.0-20200120.070606-5.jar
.../beam-sdks-java-extensions-google-cloud-platform-core/2.20.0-SNAPSHOT/beam-sdks-java-extensions-google-cloud-platform-core-2.20.0-20200120.072059-5.jar
.../beam-sdks-java-io-google-cloud-platform/2.20.0-SNAPSHOT/beam-sdks-java-io-google-cloud-platform-2.20.0-20200120.072441-5.jar
.../beam-sdks-java-extensions-protobuf/2.20.0-SNAPSHOT/beam-sdks-java-extensions-protobuf-2.20.0-20200120.072149-5.jar
.../beam-runners-core-construction-java/2.20.0-SNAPSHOT/beam-runners-core-construction-java-2.20.0-20200120.070440-5.jar
.../beam-runners-direct-java/2.20.0-SNAPSHOT/beam-runners-direct-java-2.20.0-20200120.070532-5.jar
```
> Beam's own Avro TimeConversion class in beam-sdk-java-core
> -----------------------------------------------------------
>
> Key: BEAM-9144
> URL: https://issues.apache.org/jira/browse/BEAM-9144
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Reporter: Tomo Suzuki
> Assignee: Tomo Suzuki
> Priority: Major
> Fix For: 2.19.0
>
> Attachments: avro-beam-dependency-graph.png
>
> Time Spent: 1h 40m
> Remaining Estimate: 0h
>
> From Aaron's comment in
> https://issues.apache.org/jira/browse/BEAM-8388?focusedCommentId=17016476&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17016476
> .
> {quote}My org must use Avro 1.9.x (due to some Avro schema resolution issues
> resolved in 1.9.x) so downgrading Avro is not possible for us.
> Beam 2.16.0 is compatible with our usage of Avro 1.9.x – but upgrading to
> 2.17.0 we are broken as 2.17.0 links to Java classes in Avro 1.8.x that are
> not available in 1.9.x.
> {quote}
> The Java class is
> {{org.apache.avro.data.TimeConversions.TimestampConversion}} in Avro 1.8.
> It's renamed to {{org.apache.avro.data.JodaTimeConversions}} in Avro 1.9.
> h1. Beam Java SDK cannot upgrade Avro to 1.9
> Beam has Spark runners and Spark has not yet upgraded to Avro 1.9.
> Illustration of the dependency
> !avro-beam-dependency-graph.png|width=799,height=385!
> h1. Short-term Solution
> As illustrated above, as long as Beam Java SDK uses only the intersection of
> Avro classes, method, and fields between Avro 1.8 and 1.9, it will provide
> flexibility in runtime Avro versions (as it did until Beam 2.16).
> h2. Difference of the TimeConversion Classes
> Avro 1.9's TimestampConversion overrides {{getRecommendedSchema}} method.
> Details below:
> Avro 1.8's TimeConversions.TimestampConversion:
> {code:java}
> public static class TimestampConversion extends Conversion<DateTime> {
> @Override
> public Class<DateTime> getConvertedType() {
> return DateTime.class;
> }
> @Override
> public String getLogicalTypeName() {
> return "timestamp-millis";
> }
> @Override
> public DateTime fromLong(Long millisFromEpoch, Schema schema, LogicalType
> type) {
> return new DateTime(millisFromEpoch, DateTimeZone.UTC);
> }
> @Override
> public Long toLong(DateTime timestamp, Schema schema, LogicalType type) {
> return timestamp.getMillis();
> }
> }
> {code}
> Avro 1.9's JodaTimeConversions.TimestampConversion:
> {code:java}
> public static class TimestampConversion extends Conversion<DateTime> {
> @Override
> public Class<DateTime> getConvertedType() {
> return DateTime.class;
> }
> @Override
> public String getLogicalTypeName() {
> return "timestamp-millis";
> }
> @Override
> public DateTime fromLong(Long millisFromEpoch, Schema schema, LogicalType
> type) {
> return new DateTime(millisFromEpoch, DateTimeZone.UTC);
> }
> @Override
> public Long toLong(DateTime timestamp, Schema schema, LogicalType type) {
> return timestamp.getMillis();
> }
> @Override
> public Schema getRecommendedSchema() {
> return
> LogicalTypes.timestampMillis().addToSchema(Schema.create(Schema.Type.LONG));
> }
> }
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)