BELUGA BEHR created AVRO-2049:
---------------------------------
Summary: Remove Superfluous Configuration From AvroSerializer
Key: AVRO-2049
URL: https://issues.apache.org/jira/browse/AVRO-2049
Project: Avro
Issue Type: Improvement
Components: java
Affects Versions: 1.8.2, 1.7.7
Reporter: BELUGA BEHR
Priority: Trivial
In the class {{org.apache.avro.hadoop.io.AvroSerializer}}, the Avro encoder
block size is configured with a hard-coded value, and a TODO comment asks for
benchmarking with different values.
{code:title=org.apache.avro.hadoop.io.AvroSerializer}
  /**
   * The block size for the Avro encoder.
   *
   * This number was copied from the AvroSerialization of org.apache.avro.mapred in Avro 1.5.1.
   *
   * TODO(gwu): Do some benchmarking with different numbers here to see if it is important.
   */
  private static final int AVRO_ENCODER_BLOCK_SIZE_BYTES = 512;

  /** An factory for creating Avro datum encoders. */
  private static EncoderFactory mEncoderFactory
      = new EncoderFactory().configureBlockSize(AVRO_ENCODER_BLOCK_SIZE_BYTES);
{code}
However, there is no need to benchmark. The setting is superfluous because the
current implementation never uses it:
{code:title=org.apache.avro.hadoop.io.AvroSerializer}
  @Override
  public void open(OutputStream outputStream) throws IOException {
    mOutputStream = outputStream;
    mAvroEncoder = mEncoderFactory.binaryEncoder(outputStream, mAvroEncoder);
  }
{code}
{{org.apache.avro.io.EncoderFactory.binaryEncoder}} ignores the block size. The
setting is only relevant for calls to
{{org.apache.avro.io.EncoderFactory.blockingBinaryEncoder}},
which uses the configured block size when binary-encoding blocked Array types
as laid out in the
[specs|https://avro.apache.org/docs/1.8.2/spec.html#binary_encode_complex].
The {{AVRO_ENCODER_BLOCK_SIZE_BYTES}} constant and the {{configureBlockSize}}
call can simply be removed.
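
For context, here is a rough sketch of what a block size means for the blocked array encoding described in the spec. It is not Avro's actual {{BlockingBinaryEncoder}} code: the class and method names ({{BlockedArraySketch}}, {{encodeArray}}, {{flushBlock}}) are hypothetical, and the rule for when a block is closed is an assumption made for illustration. It only shows that a block-size threshold changes how many blocks an array is split into, which does not apply to a non-blocking encoder.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration only; not part of Avro.
class BlockedArraySketch {

    // Writes a long in Avro's zig-zag, variable-length format.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63); // zig-zag: small magnitudes become small codes
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80)); // low 7 bits, continuation bit set
            z >>>= 7;
        }
        out.write((int) z);
    }

    // Emits one block: negative item count, byte size of the block, then the items.
    static void flushBlock(ByteArrayOutputStream out, ByteArrayOutputStream block, long count) {
        writeLong(out, -count);       // negative count means a byte size follows
        writeLong(out, block.size()); // number of bytes in this block
        out.write(block.toByteArray(), 0, block.size());
        block.reset();
    }

    // Encodes an array of longs as a series of blocks ended by a zero count.
    // Assumption: a block is closed once it holds at least blockSizeBytes bytes.
    static byte[] encodeArray(long[] items, int blockSizeBytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ByteArrayOutputStream block = new ByteArrayOutputStream();
        long count = 0;
        for (long item : items) {
            writeLong(block, item);
            count++;
            if (block.size() >= blockSizeBytes) {
                flushBlock(out, block, count);
                count = 0;
            }
        }
        if (count > 0) {
            flushBlock(out, block, count);
        }
        writeLong(out, 0); // a block with count zero ends the array
        return out.toByteArray();
    }

    public static void main(String[] args) {
        long[] items = {1, 2, 3};
        // Large threshold: all items fit in a single block.
        System.out.println(encodeArray(items, 512).length); // prints 6
        // Tiny threshold: every item ends up in its own block.
        System.out.println(encodeArray(items, 1).length);   // prints 10
    }
}
```

The same three items take 6 bytes as one block and 10 bytes as three blocks, so the threshold matters only to an encoder that writes blocks this way.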