bvolpato opened a new pull request, #29412: URL: https://github.com/apache/beam/pull/29412
After https://github.com/apache/beam/pull/27851, user code that depends on Avro versions newer than 1.8.2 has problems running on Dataflow.

For example, in https://github.com/GoogleCloudPlatform/DataflowTemplates, where we moved to Avro 1.11.3, there were incompatibility errors:

> Caused by: java.io.InvalidClassException: org.apache.avro.specific.SpecificRecordBase; local class incompatible: stream classdesc serialVersionUID = -1463700717714793795, local class serialVersionUID = 189988654766568477

and

> Caused by: java.lang.NoSuchMethodError: 'boolean org.apache.avro.generic.GenericRecord.hasField(java.lang.String)'
> com.google.cloud.teleport.v2.transforms.FormatDatastreamRecordToJson.getMetadataIsDeleted(FormatDatastreamRecordToJson.java:258)
> com.google.cloud.teleport.v2.transforms.FormatDatastreamRecordToJson.apply(FormatDatastreamRecordToJson.java:123)
> com.google.cloud.teleport.v2.transforms.FormatDatastreamRecordToJson.apply(FormatDatastreamRecordToJson.java:51)
> org.apache.beam.sdk.extensions.avro.io.AvroSource$AvroBlock.readNextRecord(AvroSource.java:610)

The root cause is that Avro classes are now shipped inside `/opt/apache/beam/jars/beam-sdks-java-harness.jar`, which wasn't the case before.

Marking Avro as provided should solve the problem and allow users to control their Avro version.

(Tried to relocate in https://github.com/apache/beam/pull/29407 but got some test failures, so trying this as an alternative.)
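
For anyone who wants to confirm which Avro copy wins on a worker, here is a minimal diagnostic sketch. It is not part of this PR: the class name `ClassOrigin` and its helper methods are hypothetical, and it uses only standard JDK calls (`ProtectionDomain.getCodeSource()` and `ObjectStreamClass.lookup`). It reports where a class was loaded from and the `serialVersionUID` that Java serialization sees for it, which are the two things the errors above disagree on.

```java
import java.io.ObjectStreamClass;
import java.net.URL;
import java.security.CodeSource;

// Hypothetical helper for illustration only; not part of Beam or Avro.
class ClassOrigin {

  /** Returns the jar or directory a class was loaded from, or a note when unknown. */
  static String locationOf(Class<?> cls) {
    CodeSource source = cls.getProtectionDomain().getCodeSource();
    // Classes provided by the JDK itself usually have no code source.
    if (source == null) {
      return "(no code source; likely a JDK class)";
    }
    URL location = source.getLocation();
    return location == null ? "(unknown location)" : location.toString();
  }

  /** Returns the serialVersionUID used by Java serialization, or null if not Serializable. */
  static Long serialVersionUidOf(Class<?> cls) {
    ObjectStreamClass desc = ObjectStreamClass.lookup(cls);
    return desc == null ? null : desc.getSerialVersionUID();
  }

  /** Builds a one-line report for the named class. */
  static String describe(String className) throws ClassNotFoundException {
    Class<?> cls = Class.forName(className);
    return className
        + " loaded from " + locationOf(cls)
        + ", serialVersionUID=" + serialVersionUidOf(cls);
  }

  public static void main(String[] args) throws Exception {
    // Defaults to the class named in the InvalidClassException above;
    // this throws ClassNotFoundException if Avro is not on the classpath.
    String name = args.length > 0 ? args[0] : "org.apache.avro.specific.SpecificRecordBase";
    System.out.println(describe(name));
  }
}
```

Calling `ClassOrigin.describe(...)` from code that runs on the worker and comparing the result with the same call on the launcher should show whether the class is coming from the harness jar or from the user's own dependency.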
