Remove byte-by-byte copying in RecordBuilderBase.defaultValue
-------------------------------------------------------------
Key: AVRO-985
URL: https://issues.apache.org/jira/browse/AVRO-985
Project: Avro
Issue Type: Improvement
Reporter: Douglas Kaminsky
In one section of RecordBuilderBase.defaultValue(Field) (quoted below), the
default Java object is built by encoding the field's default JSON value to
binary and then decoding the resulting bytes. This round trip is very
inefficient and causes large slowdowns when building large object sets,
including latency spikes when the binary encoder flushes.
A simple workaround that covers the majority of cases would be a separate code
path for "primitives" (fixed, string, boolean, int, double, enum, float, bytes)
that creates the value directly rather than going through a full bytewise copy
(and the subsequent deep copy).
*_RecordBuilderBase.java_*:
{code}
// If not cached, get the default Java value by encoding the default JSON
// value and then decoding it:
if (defaultValue == null) {
  ByteArrayOutputStream baos = new ByteArrayOutputStream();
  encoder = EncoderFactory.get().binaryEncoder(baos, encoder);
  ResolvingGrammarGenerator.encode(
      encoder, field.schema(), defaultJsonValue);
  encoder.flush();
  decoder = DecoderFactory.get().binaryDecoder(
      baos.toByteArray(), decoder);
  defaultValue = new GenericDatumReader(
      field.schema()).read(null, decoder);
  defaultSchemaValues.putIfAbsent(field.pos(), defaultValue);
}
{code}
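A rough sketch of the separate code path suggested above, written so that it
compiles without Avro on the classpath. Everything in it is an assumption made
for illustration: Kind is a hypothetical stand-in for the schema type,
fastDefault is a hypothetical helper, and the JSON value is passed as a plain
Boolean, Number, or String rather than as a parsed JSON node. Enum and fixed
are left out because creating them needs the field's schema (symbols, size).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the schema type of a field (not Avro's own enum).
enum Kind { BOOLEAN, INT, LONG, FLOAT, DOUBLE, STRING, BYTES, RECORD, ARRAY, MAP }

final class DefaultValueFastPath {
  private DefaultValueFastPath() {}

  /**
   * Builds the default value for a simple type directly from its JSON value.
   * Returns null for any type that is not handled here, so that the caller
   * can fall back to the existing encode-then-decode round trip.
   *
   * Assumption: jsonValue is a Boolean, Number, or String, as a JSON parser
   * would produce for a scalar default.
   */
  static Object fastDefault(Kind kind, Object jsonValue) {
    switch (kind) {
      case BOOLEAN:
        return (Boolean) jsonValue;
      case INT:
        return ((Number) jsonValue).intValue();
      case LONG:
        return ((Number) jsonValue).longValue();
      case FLOAT:
        return ((Number) jsonValue).floatValue();
      case DOUBLE:
        return ((Number) jsonValue).doubleValue();
      case STRING:
        return (String) jsonValue;
      case BYTES:
        // Avro writes a bytes default in JSON as a string whose code points
        // 0-255 stand for the byte values, hence the ISO-8859-1 conversion.
        return ByteBuffer.wrap(
            ((String) jsonValue).getBytes(StandardCharsets.ISO_8859_1));
      default:
        // Records, arrays, maps and anything else: no fast path.
        return null;
    }
  }
}
```

In defaultValue(Field), a helper like this would be tried first and the
encode/decode block above would run only when it returns null. A real patch
would also need to return the same Java types the generic reader produces
(for example, its own string class rather than java.lang.String), and a
ByteBuffer default would still need to be copied before being handed out.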