[ https://issues.apache.org/jira/browse/AVRO-1393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Shreedharan resolved AVRO-1393.
------------------------------------

    Resolution: Not A Problem

> SyncInterval logic always causes blocks to be larger than the sync interval
> ---------------------------------------------------------------------------
>
>                 Key: AVRO-1393
>                 URL: https://issues.apache.org/jira/browse/AVRO-1393
>             Project: Avro
>          Issue Type: Bug
>            Reporter: Hari Shreedharan
>
> If the sync interval of a container file is set to exactly the block size, 
> each stretch of data between sync markers will be slightly larger than the 
> block, because we check the size only after writing data to the stream. This 
> means that the sync interval is effectively the smallest possible distance 
> between sync markers. 
> Since we cannot predict the serialized size of a datum, we can never know 
> by how much the data will overflow the block. Whatever the case, this might 
> be more expensive than expected, especially on systems like HDFS.
> Fixing this is difficult without breaking a bunch of interfaces, so I am 
> opening this JIRA for discussion with people who know the code better.
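
The behaviour described above can be illustrated with a small toy model. This is only a sketch of the check-after-write pattern the report describes, not Avro's actual writer code; the function name block_sizes, its parameters, and the datum sizes are made up for illustration.

```python
def block_sizes(datum_sizes, sync_interval):
    """Toy model: sizes of the blocks a check-after-write writer would emit.

    Hypothetical helper, not part of Avro. Each entry of datum_sizes is the
    serialized size in bytes of one datum.
    """
    blocks = []
    buffered = 0
    for size in datum_sizes:
        buffered += size               # the datum is written to the buffer first
        if buffered >= sync_interval:  # the size is checked only afterwards
            blocks.append(buffered)    # flush the block; a sync marker follows
            buffered = 0
    if buffered:
        blocks.append(buffered)        # final partial block when the file closes
    return blocks


# Ten 30-byte datums with a 100-byte sync interval.
print(block_sizes([30] * 10, sync_interval=100))  # [120, 120, 60]
```

In this model every block except the last is at least as large as the sync interval, and it overshoots by up to one datum's serialized size, which is not known before the datum is written.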



