I am using Avro 1.7.7 in development and Avro 1.7.4 on my Hadoop cluster.
I have a fairly large .avdl schema - a record with about 100 fields. When
running locally under test there were no problems with this schema:
everything serialized and deserialized fine.
When running on Hadoop, however, I was getting this error:

    Exception in thread "main" java.lang.NoSuchMethodError:
    org.apache.avro.Schema$Parser.parse(Ljava/lang/String;[Ljava/lang/String;)
The reason was that the JSON schema embedded in the compiled Java class was
being split into two string literals:
    public class SomeType extends org.apache.avro.specific.SpecificRecordBase
        implements org.apache.avro.specific.SpecificRecord {

      public static final org.apache.avro.Schema SCHEMA$ =
          new org.apache.avro.Schema.Parser().parse("long schema string part1",
              "long schema string part2");
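
Here is a minimal, self-contained reproduction of the same call pattern, with
a toy two-part schema standing in for the real 100-field record (the class,
field, and schema names are just for illustration):

    import org.apache.avro.Schema;

    public class SplitSchemaDemo {
        // Toy schema split across two literals, the way the generated
        // SCHEMA$ initializer splits the real one.
        private static final String PART1 =
            "{\"type\":\"record\",\"name\":\"SomeType\",\"fields\":[";
        private static final String PART2 =
            "{\"name\":\"id\",\"type\":\"long\"}]}";

        public static void main(String[] args) {
            // Same two-argument call the generated SCHEMA$ initializer makes:
            // it works locally, but the identical call site throws
            // java.lang.NoSuchMethodError against the cluster's Avro jar.
            Schema schema = new Schema.Parser().parse(PART1, PART2);
            System.out.println(schema.toString(true));
        }
    }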
Now, version 1.7.7 has this method signature on Schema.Parser:

    public Schema parse(String s, String... more)
So the split schema string works fine locally, but version 1.7.4 does not have
this method, hence the NoSuchMethodError when the compiled classes run on Hadoop.
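
As a sanity check, a reflection sketch like the one below (class name is just
for illustration) lists the parse overloads exposed by whichever Avro jar is on
the runtime classpath; against 1.7.7 it lists the varargs
parse(String, String...), whereas against 1.7.4 I would expect to see only a
single-argument parse(String):

    import java.lang.reflect.Method;

    import org.apache.avro.Schema;

    public class ParserMethodCheck {
        public static void main(String[] args) {
            // Print every parse(...) overload declared by Schema.Parser on the
            // Avro jar that is actually on the classpath at runtime.
            for (Method m : Schema.Parser.class.getDeclaredMethods()) {
                if (m.getName().equals("parse")) {
                    System.out.println(m);
                }
            }
        }
    }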
Is this intentional or a bug?
If intentional, what are the rules determining when Avro breaks up a schema
string?
Where is this behaviour documented?
Why does it do it at all?
Thanks