wangying created AVRO-2396:
------------------------------
Summary: Huge performance regression on SpecificDatumReader for
array reading
Key: AVRO-2396
URL: https://issues.apache.org/jira/browse/AVRO-2396
Project: Apache Avro
Issue Type: Bug
Components: csharp
Affects Versions: 1.9.0
Reporter: wangying
The company where I work as a .NET developer uses the Avro format for
messages.
Recently, after upgrading to 1.9.0-rc2 and 1.9.0-rc4, we hit a huge regression
when reading array objects.
Our test case reads an ETP-defined object "Energistics.Datatypes.ChannelData"
that contains 5000 data items. With the previous Avro version it took only
about 300 ms to read the data, whereas with the latest version it took more
than 1 minute. (The protocol can be found at
[https://www.energistics.org/etp-developers-users/])
After looking through the code, I believe this is caused by a change in the
SpecificRecordAccess class:
    public object CreateRecord(object reuse)
    {
        return reuse ?? ObjectCreator.Instance.New(typeName, Schema.Type.Record);
    }
Here, reuse is null, so ObjectCreator.Instance.New runs 5000 times, and each
call uses reflection to resolve the specific type.
The previous version did this only once, in the constructor:
    private class SpecificRecordAccess : RecordAccess
    {
        private ObjectCreator.CtorDelegate objCreator;

        public SpecificRecordAccess(RecordSchema readerSchema)
        {
            objCreator = GetConstructor(readerSchema.Fullname, Schema.Type.Record);
        }

        public object CreateRecord(object reuse)
        {
            return reuse ?? objCreator();
        }
    }
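To illustrate the difference between the two versions, here is a minimal
standalone sketch. It does not use Avro at all: DataItem, RecordFactoryDemo,
CreateUncached, BuildConstructor and ReadItems are hypothetical names made up
for this example, and the reflection calls are only a stand-in for whatever
ObjectCreator does internally. CreateUncached repeats the type lookup on every
call, while BuildConstructor resolves the type once and returns a delegate that
can be reused for every element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical stand-in for a generated record class (not part of Avro).
public class DataItem
{
    public long Value;
}

public static class RecordFactoryDemo
{
    // Per-call lookup: resolves the type by name and instantiates it
    // through reflection on every invocation.
    public static object CreateUncached(string typeName)
    {
        Type type = Type.GetType(typeName, true);
        return Activator.CreateInstance(type);
    }

    // One-time lookup: resolves the type once and compiles a delegate
    // that invokes the parameterless constructor directly.
    public static Func<object> BuildConstructor(string typeName)
    {
        Type type = Type.GetType(typeName, true);
        Expression body = Expression.Convert(Expression.New(type), typeof(object));
        return Expression.Lambda<Func<object>>(body).Compile();
    }

    // Mimics an array read: creates one new record per element.
    public static List<object> ReadItems(int count, Func<object> create)
    {
        var list = new List<object>(count);
        for (int i = 0; i < count; i++)
        {
            list.Add(create());
        }
        return list;
    }
}
```

Assuming the sketch above, calling
ReadItems(5000, RecordFactoryDemo.BuildConstructor(name)) performs the type
lookup once, whereas ReadItems(5000, () => RecordFactoryDemo.CreateUncached(name))
performs it 5000 times, where name is typeof(DataItem).AssemblyQualifiedName.
The actual cost per lookup in Avro may differ from this stand-in.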
As a workaround, I tried passing a reuse object to the method public T
Read(T reuse, Decoder decoder), but this does not help, because
SpecificDatumReader does not pass it through in the method below:
    public void AddElements(object array, int elements, int index,
        ReadItem itemReader, Decoder decoder, bool reuse, object reuseobj)
    {
        var list = (IList)array;
        for (int i = 0; i < elements; i++)
        {
            list.Add(itemReader(null, decoder));
        }
    }