[ 
https://issues.apache.org/jira/browse/BEAM-9144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17019629#comment-17019629
 ] 

Aaron Dixon edited comment on BEAM-9144 at 1/20/20 4:34 PM:
------------------------------------------------------------

Attempting to validate with 2.20.0-SNAPSHOT [1]

My pipeline however is unable to start in Dataflow, I get:

{noformat}
java.lang.ClassNotFoundException: 
org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.Message$Builder
{noformat}

When I inspect my build artifacts I find that when I depend on Beam 2.16.0 
(last Beam build that works for me), I find this class present (v1p21p0). When 
I build against Beam 2.20.0-SNAPSHOT, only v1p26p0 is resolved and packaged.

* I assume that the Dataflow runner needs to somehow know to look for v1p26p0 
somehow?  I.e., do I need to send it a flag that I'm suing 2.20.0-SNAPSHOT so 
that it can know to try to resolve the later versions of the rpc message (?) 
classes?
* --or-- Is this a bug in 2.20.0-SNAPSHOT?


===
[1] Complete list of 2.20.0-SNAPSHOT dependencies resolved by my build:


{noformat}
.../beam-sdks-java-io-kafka/2.20.0-SNAPSHOT/beam-sdks-java-io-kafka-2.20.0-20200120.072626-5.pom
.../beam-sdks-java-core/2.20.0-SNAPSHOT/beam-sdks-java-core-2.20.0-20200120.071349-5.pom
.../beam-model-pipeline/2.20.0-SNAPSHOT/beam-model-pipeline-2.20.0-20200120.070348-5.pom
.../beam-model-job-management/2.20.0-SNAPSHOT/beam-model-job-management-2.20.0-20200120.070335-5.pom
.../beam-runners-google-cloud-dataflow-java/2.20.0-SNAPSHOT/beam-runners-google-cloud-dataflow-java-2.20.0-20200120.070606-5.pom
.../beam-sdks-java-extensions-google-cloud-platform-core/2.20.0-SNAPSHOT/beam-sdks-java-extensions-google-cloud-platform-core-2.20.0-20200120.072059-5.pom
.../beam-sdks-java-io-google-cloud-platform/2.20.0-SNAPSHOT/beam-sdks-java-io-google-cloud-platform-2.20.0-20200120.072441-5.pom
.../beam-sdks-java-extensions-protobuf/2.20.0-SNAPSHOT/beam-sdks-java-extensions-protobuf-2.20.0-20200120.072149-5.pom
.../beam-runners-core-construction-java/2.20.0-SNAPSHOT/beam-runners-core-construction-java-2.20.0-20200120.070440-5.pom
.../beam-runners-direct-java/2.20.0-SNAPSHOT/beam-runners-direct-java-2.20.0-20200120.070532-5.pom
.../beam-sdks-java-core/2.20.0-SNAPSHOT/beam-sdks-java-core-2.20.0-20200120.071349-5.jar
.../beam-model-job-management/2.20.0-SNAPSHOT/beam-model-job-management-2.20.0-20200120.070335-5.jar
.../beam-model-pipeline/2.20.0-SNAPSHOT/beam-model-pipeline-2.20.0-20200120.070348-5.jar
.../beam-sdks-java-io-kafka/2.20.0-SNAPSHOT/beam-sdks-java-io-kafka-2.20.0-20200120.072626-5.jar
.../beam-runners-google-cloud-dataflow-java/2.20.0-SNAPSHOT/beam-runners-google-cloud-dataflow-java-2.20.0-20200120.070606-5.jar
.../beam-sdks-java-extensions-google-cloud-platform-core/2.20.0-SNAPSHOT/beam-sdks-java-extensions-google-cloud-platform-core-2.20.0-20200120.072059-5.jar
.../beam-sdks-java-io-google-cloud-platform/2.20.0-SNAPSHOT/beam-sdks-java-io-google-cloud-platform-2.20.0-20200120.072441-5.jar
.../beam-sdks-java-extensions-protobuf/2.20.0-SNAPSHOT/beam-sdks-java-extensions-protobuf-2.20.0-20200120.072149-5.jar
.../beam-runners-core-construction-java/2.20.0-SNAPSHOT/beam-runners-core-construction-java-2.20.0-20200120.070440-5.jar
.../beam-runners-direct-java/2.20.0-SNAPSHOT/beam-runners-direct-java-2.20.0-20200120.070532-5.jar
{noformat}


was (Author: atdixon):
Attempting to validate with 2.20.0-SNAPSHOT [1]

My pipeline however is unable to start in Dataflow, I get:
`java.lang.ClassNotFoundException: 
org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.Message$Builder`

When I inspect my build artifacts I find that when I depend on Beam 2.16.0 
(last Beam build that works for me), I find this class present (v1p21p0). When 
I build against Beam 2.20.0-SNAPSHOT, only v1p26p0 is resolved and packaged.

* I assume that the Dataflow runner needs to somehow know to look for v1p26p0 
somehow?  I.e., do I need to send it a flag that I'm suing 2.20.0-SNAPSHOT so 
that it can know to try to resolve the later versions of the rpc message (?) 
classes?
* --or-- Is this a bug in 2.20.0-SNAPSHOT?


===
[1] Complete list of 2.20.0-SNAPSHOT dependencies resolved by my build:
```
.../beam-sdks-java-io-kafka/2.20.0-SNAPSHOT/beam-sdks-java-io-kafka-2.20.0-20200120.072626-5.pom
.../beam-sdks-java-core/2.20.0-SNAPSHOT/beam-sdks-java-core-2.20.0-20200120.071349-5.pom
.../beam-model-pipeline/2.20.0-SNAPSHOT/beam-model-pipeline-2.20.0-20200120.070348-5.pom
.../beam-model-job-management/2.20.0-SNAPSHOT/beam-model-job-management-2.20.0-20200120.070335-5.pom
.../beam-runners-google-cloud-dataflow-java/2.20.0-SNAPSHOT/beam-runners-google-cloud-dataflow-java-2.20.0-20200120.070606-5.pom
.../beam-sdks-java-extensions-google-cloud-platform-core/2.20.0-SNAPSHOT/beam-sdks-java-extensions-google-cloud-platform-core-2.20.0-20200120.072059-5.pom
.../beam-sdks-java-io-google-cloud-platform/2.20.0-SNAPSHOT/beam-sdks-java-io-google-cloud-platform-2.20.0-20200120.072441-5.pom
.../beam-sdks-java-extensions-protobuf/2.20.0-SNAPSHOT/beam-sdks-java-extensions-protobuf-2.20.0-20200120.072149-5.pom
.../beam-runners-core-construction-java/2.20.0-SNAPSHOT/beam-runners-core-construction-java-2.20.0-20200120.070440-5.pom
.../beam-runners-direct-java/2.20.0-SNAPSHOT/beam-runners-direct-java-2.20.0-20200120.070532-5.pom
.../beam-sdks-java-core/2.20.0-SNAPSHOT/beam-sdks-java-core-2.20.0-20200120.071349-5.jar
.../beam-model-job-management/2.20.0-SNAPSHOT/beam-model-job-management-2.20.0-20200120.070335-5.jar
.../beam-model-pipeline/2.20.0-SNAPSHOT/beam-model-pipeline-2.20.0-20200120.070348-5.jar
.../beam-sdks-java-io-kafka/2.20.0-SNAPSHOT/beam-sdks-java-io-kafka-2.20.0-20200120.072626-5.jar
.../beam-runners-google-cloud-dataflow-java/2.20.0-SNAPSHOT/beam-runners-google-cloud-dataflow-java-2.20.0-20200120.070606-5.jar
.../beam-sdks-java-extensions-google-cloud-platform-core/2.20.0-SNAPSHOT/beam-sdks-java-extensions-google-cloud-platform-core-2.20.0-20200120.072059-5.jar
.../beam-sdks-java-io-google-cloud-platform/2.20.0-SNAPSHOT/beam-sdks-java-io-google-cloud-platform-2.20.0-20200120.072441-5.jar
.../beam-sdks-java-extensions-protobuf/2.20.0-SNAPSHOT/beam-sdks-java-extensions-protobuf-2.20.0-20200120.072149-5.jar
.../beam-runners-core-construction-java/2.20.0-SNAPSHOT/beam-runners-core-construction-java-2.20.0-20200120.070440-5.jar
.../beam-runners-direct-java/2.20.0-SNAPSHOT/beam-runners-direct-java-2.20.0-20200120.070532-5.jar
```

> Beam's own Avro TimeConversion class in beam-sdk-java-core 
> -----------------------------------------------------------
>
>                 Key: BEAM-9144
>                 URL: https://issues.apache.org/jira/browse/BEAM-9144
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>            Reporter: Tomo Suzuki
>            Assignee: Tomo Suzuki
>            Priority: Major
>             Fix For: 2.19.0
>
>         Attachments: avro-beam-dependency-graph.png
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> From Aaron's comment in 
> https://issues.apache.org/jira/browse/BEAM-8388?focusedCommentId=17016476&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17016476
>  .
> {quote}My org must use Avro 1.9.x (due to some Avro schema resolution issues 
> resolved in 1.9.x) so downgrading Avro is not possible for us.
>  Beam 2.16.0 is compatible with our usage of Avro 1.9.x – but upgrading to 
> 2.17.0 we are broken as 2.17.0 links to Java classes in Avro 1.8.x that are 
> not available in 1.9.x.
> {quote}
> The Java class is 
> {{org.apache.avro.data.TimeConversions.TimestampConversion}} in Avro 1.8.
>  It's renamed to {{org.apache.avro.data.JodaTimeConversions}} in Avro 1.9.
> h1. Beam Java SDK cannot upgrade Avro to 1.9
> Beam has Spark runners and Spark has not yet upgraded to Avro 1.9.
> Illustration of the dependency
> !avro-beam-dependency-graph.png|width=799,height=385!
> h1. Short-term Solution
> As illustrated above, as long as Beam Java SDK uses only the intersection of 
> Avro classes, method, and fields between Avro 1.8 and 1.9, it will provide 
> flexibility in runtime Avro versions (as it did until Beam 2.16).
> h2. Difference of the TimeConversion Classes
> Avro 1.9's TimestampConversion overrides {{getRecommendedSchema}} method. 
> Details below:
> Avro 1.8's TimeConversions.TimestampConversion:
> {code:java}
>   public static class TimestampConversion extends Conversion<DateTime> {
>     @Override
>     public Class<DateTime> getConvertedType() {
>       return DateTime.class;
>     }
>     @Override
>     public String getLogicalTypeName() {
>       return "timestamp-millis";
>     }
>     @Override
>     public DateTime fromLong(Long millisFromEpoch, Schema schema, LogicalType 
> type) {
>       return new DateTime(millisFromEpoch, DateTimeZone.UTC);
>     }
>     @Override
>     public Long toLong(DateTime timestamp, Schema schema, LogicalType type) {
>       return timestamp.getMillis();
>     }
>   }
> {code}
> Avro 1.9's JodaTimeConversions.TimestampConversion:
> {code:java}
>   public static class TimestampConversion extends Conversion<DateTime> {
>     @Override
>     public Class<DateTime> getConvertedType() {
>       return DateTime.class;
>     }
>     @Override
>     public String getLogicalTypeName() {
>       return "timestamp-millis";
>     }
>     @Override
>     public DateTime fromLong(Long millisFromEpoch, Schema schema, LogicalType 
> type) {
>       return new DateTime(millisFromEpoch, DateTimeZone.UTC);
>     }
>     @Override
>     public Long toLong(DateTime timestamp, Schema schema, LogicalType type) {
>       return timestamp.getMillis();
>     }
>     @Override
>     public Schema getRecommendedSchema() {
>       return 
> LogicalTypes.timestampMillis().addToSchema(Schema.create(Schema.Type.LONG));
>     }
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to