Hi,

I have Avro files compressed with deflate (compression level 9). I am
wondering if increasing the sync interval, which to my understanding
implies increasing the size of each Avro block, would lead to better
compression ratios.
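
One way to check this outside of Avro is a minimal sketch like the one below. It assumes each block is deflated independently at level 9 as a raw deflate stream, which is my understanding of what the Avro deflate codec does. The class name, the helper methods, and the synthetic records are hypothetical and only there for illustration; real data will behave differently.

```java
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.Deflater;

public class BlockSizeSketch {

    // Deflate `data` in independent chunks of `blockSize` bytes (level 9,
    // raw deflate) and return the total number of compressed bytes.
    // Each chunk gets a fresh Deflater, so no history is shared between chunks.
    public static long compressedSize(byte[] data, int blockSize) {
        byte[] buf = new byte[64 * 1024];
        long total = 0;
        for (int off = 0; off < data.length; off += blockSize) {
            int len = Math.min(blockSize, data.length - off);
            Deflater deflater = new Deflater(9, true); // true = no zlib header
            deflater.setInput(data, off, len);
            deflater.finish();
            while (!deflater.finished()) {
                total += deflater.deflate(buf);
            }
            deflater.end();
        }
        return total;
    }

    // Hypothetical stand-in for serialized records: short, repetitive text rows.
    public static byte[] sampleData(int records) {
        Random rnd = new Random(42);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < records; i++) {
            sb.append("user").append(rnd.nextInt(1000))
              .append(",event").append(rnd.nextInt(20)).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = sampleData(200_000);
        // Block sizes spanning the suggested 2KB to 2MB range.
        for (int blockSize : new int[] {2 << 10, 16 << 10, 64 << 10, 2 << 20}) {
            System.out.printf("block=%d bytes -> compressed=%d bytes%n",
                    blockSize, compressedSize(data, blockSize));
        }
    }
}
```

Running main prints the total compressed size for each block size, so the effect of larger blocks can be compared directly. This ignores the per-block sync marker and counts that Avro writes, so it only approximates the on-disk size.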

I see that the suggested values for the sync interval
<https://avro.apache.org/docs/1.7.6/api/java/org/apache/avro/file/DataFileWriter.html#setSyncInterval(int)>
are between 2KB and 2MB. However, I have been unable to find any
explanation of *why* that range is recommended. Given that my HDFS block
size is around 128MB, why is the maximum suggested sync interval only
2MB?

Thanks,
Ben
