radai rosenblatt created AVRO-3094:
--------------------------------------
Summary: performance regression in SpecificData.getForClass() when run with code generated by older avro
Key: AVRO-3094
URL: https://issues.apache.org/jira/browse/AVRO-3094
Project: Apache Avro
Issue Type: Bug
Components: java
Affects Versions: 1.10.2, 1.9.2
Reporter: radai rosenblatt
Starting with 1.9, Avro's SpecificData supports per-class MODEL$ definitions and
looks for them on specific record classes:
{code:java}
public static <T> SpecificData getForClass(Class<T> c) {
  if (SpecificRecordBase.class.isAssignableFrom(c)) {
    final Field specificDataField;
    try {
      specificDataField = c.getDeclaredField("MODEL$");
      specificDataField.setAccessible(true);
      return (SpecificData) specificDataField.get(null);
    } catch (NoSuchFieldException e) {
      // Return default instance
      return SpecificData.get(); // <======= EXPENSIVE
    } catch (IllegalAccessException e) {
      throw new AvroRuntimeException(e);
    }
  }
  return SpecificData.get();
}
{code}
When this runs against specific record classes generated by older Avro, which do
not have a MODEL$ field, every call takes the NoSuchFieldException branch, and
this results in a serious performance degradation. We've measured the impact on
user code to be 3x slower in one case (see attached flame graph).
Under Java 7+ it should be entirely possible to cache the existence (or lack
thereof) of MODEL$ using
[ClassValue|https://docs.oracle.com/javase/7/docs/api/java/lang/ClassValue.html].
This would also speed things up when operating on classes generated by more
modern Avro, since it would avoid reflection on every call.
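
A minimal sketch of the ClassValue idea, not a patch against Avro. To keep it self-contained, Model is a hypothetical stand-in for SpecificData, NewRecord and OldRecord stand in for classes generated by newer and older Avro, and ModelCache and its LOOKUPS counter are illustrative names that exist only to show the reflective lookup runs once per class:

```java
import java.lang.reflect.Field;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for SpecificData so the sketch compiles without Avro.
final class Model {
    static final Model DEFAULT = new Model("default");
    final String name;
    Model(String name) { this.name = name; }
}

// Stand-in for a class generated by Avro 1.9+: carries a static MODEL$ field.
class NewRecord {
    private static final Model MODEL$ = new Model("per-class");
}

// Stand-in for a class generated by older Avro: no MODEL$ field.
class OldRecord {
}

final class ModelCache {
    // Counts reflective lookups; present only to demonstrate the caching.
    static final AtomicInteger LOOKUPS = new AtomicInteger();

    // ClassValue runs computeValue lazily and caches the result per class,
    // so the NoSuchFieldException is paid at most once per class in the
    // single-threaded case (racing threads may each compute once).
    private static final ClassValue<Model> MODELS = new ClassValue<Model>() {
        @Override
        protected Model computeValue(Class<?> type) {
            LOOKUPS.incrementAndGet();
            try {
                Field f = type.getDeclaredField("MODEL$");
                f.setAccessible(true);
                return (Model) f.get(null);
            } catch (NoSuchFieldException e) {
                // Older generated code: remember the default model instead.
                return Model.DEFAULT;
            } catch (IllegalAccessException e) {
                throw new RuntimeException(e);
            }
        }
    };

    static Model getForClass(Class<?> c) {
        return MODELS.get(c);
    }
}
```

With this shape, repeated calls such as ModelCache.getForClass(OldRecord.class) hit the cache after the first lookup. Because ClassValue associates the cached entry with the class itself, it should not keep generated classes from being unloaded.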
--
This message was sent by Atlassian Jira
(v8.3.4#803005)