[
https://issues.apache.org/jira/browse/AVRO-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13799364#comment-13799364
]
Hari Shreedharan commented on AVRO-1387:
----------------------------------------
[~cutting] - Thanks! That sounds like a good idea. I will take a look at this.
On a side note - assuming that each record is not very large (a few hundred
bytes to a few KB), and the writes happen to a local FS, is it reasonable to
write one block per record, with a sync marker after each record? In that
case, I think the Snappy codec would make sense for us.
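To make the per-record checksum idea concrete, here is a minimal sketch using only the JDK. It is not the Avro container format: the frame layout (length, payload, CRC32) and the names RecordFrames, frame and unframe are hypothetical, chosen only for illustration. As I understand the Avro spec, the Snappy codec already appends a CRC32 of the uncompressed data to each block, so with one record per block you would get roughly this guarantee per record.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical per-record framing, NOT the Avro file format:
// [4-byte big-endian length][payload bytes][4-byte CRC32 of payload]
class RecordFrames {

    // Wrap one serialized record in a frame that carries its own checksum.
    static byte[] frame(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        ByteBuffer buf = ByteBuffer.allocate(4 + payload.length + 4);
        buf.putInt(payload.length);
        buf.put(payload);
        buf.putInt((int) crc.getValue());
        return buf.array();
    }

    // Return the payload, or throw if the frame is truncated or corrupt.
    static byte[] unframe(byte[] framed) {
        if (framed.length < 8) {
            throw new IllegalStateException("frame too short");
        }
        ByteBuffer buf = ByteBuffer.wrap(framed);
        int len = buf.getInt();
        if (len != framed.length - 8) {
            throw new IllegalStateException("bad length");
        }
        byte[] payload = new byte[len];
        buf.get(payload);
        int stored = buf.getInt();
        CRC32 crc = new CRC32();
        crc.update(payload, 0, len);
        if ((int) crc.getValue() != stored) {
            throw new IllegalStateException("checksum mismatch");
        }
        return payload;
    }

    public static void main(String[] args) {
        byte[] event = "flume event body".getBytes(StandardCharsets.UTF_8);
        byte[] framed = frame(event);
        System.out.println(new String(unframe(framed), StandardCharsets.UTF_8));
    }
}
```

The trade-off is 8 bytes of overhead per record in this sketch. With real Avro blocks the overhead would also include the block header and the 16-byte sync marker, which matters more when records are only a few hundred bytes.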
> Avro container file format update to write checksums for individual record
> --------------------------------------------------------------------------
>
> Key: AVRO-1387
> URL: https://issues.apache.org/jira/browse/AVRO-1387
> Project: Avro
> Issue Type: Bug
> Reporter: Hari Shreedharan
>
> We are considering changes in Flume's file channel to use Avro. One of the
> requirements is that each event (which maps to one Avro record) be
> checksummed so we know if the data is corrupt.
> We'd probably have to add a new version for this, since this will change the
> data format on disk. I can start working on a Java version if there are no
> objections.