[
https://issues.apache.org/jira/browse/AVRO-1393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hari Shreedharan resolved AVRO-1393.
------------------------------------
Resolution: Not A Problem
> SyncInterval logic always causes blocks to be larger than the sync interval
> ---------------------------------------------------------------------------
>
> Key: AVRO-1393
> URL: https://issues.apache.org/jira/browse/AVRO-1393
> Project: Avro
> Issue Type: Bug
> Reporter: Hari Shreedharan
>
> If the sync interval in the container file is set to exactly the block size,
> then the data between sync markers will be slightly larger than the block,
> because we check the size of the file only after writing data to the stream.
> This means that the sync interval is effectively the smallest interval
> between sync markers.
> Since we cannot predict the serialized size of a datum, we can never know
> how much data will overflow the block. Either way, this might be more
> expensive than expected, especially on systems like HDFS.
> Fixing this is difficult without breaking a bunch of interfaces, so I am
> opening this JIRA for discussion with people who know the code better.
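
The behaviour described above can be illustrated with a toy simulation. This is
only a sketch of the check-after-write rule, not Avro's actual writer code; the
function name `write_blocks` and its parameters are hypothetical and chosen for
illustration.

```python
import io


def write_blocks(datums, sync_interval):
    """Simulate a writer that checks the buffer size only after each write.

    Returns the size in bytes of every block that would be flushed
    (each flush is where a sync marker would be emitted).
    """
    block_sizes = []
    buffer = io.BytesIO()
    for datum in datums:
        buffer.write(datum)                 # the datum is buffered first ...
        if buffer.tell() >= sync_interval:  # ... and only then is the size checked
            block_sizes.append(buffer.tell())
            buffer = io.BytesIO()           # start a new block
    if buffer.tell() > 0:                   # flush whatever remains at close
        block_sizes.append(buffer.tell())
    return block_sizes


# Ten 30-byte datums with a 100-byte sync interval.
sizes = write_blocks([b"x" * 30] * 10, sync_interval=100)
print(sizes)
```

In this sketch every block except the final one is at least `sync_interval`
bytes, so the interval acts as a lower bound on block size rather than an upper
bound, and the overshoot depends on the size of the last datum written.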
--
This message was sent by Atlassian JIRA
(v6.1#6144)