If the schemas are identical, you can append one file to the other with DataFileWriter.appendAllFrom: http://avro.apache.org/docs/current/api/java/org/apache/avro/file/DataFileWriter.html#appendAllFrom%28org.apache.avro.file.DataFileStream,%20boolean%29
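
A minimal sketch of how that call might be used, assuming the Avro Java library is on the classpath and both files hold generic records. The class name `AvroConcat`, the helper `appendFile`, and the file arguments are made up for illustration; `appendTo` and `appendAllFrom` are the `DataFileWriter` methods from the Javadoc linked above.

```java
import java.io.File;
import java.io.IOException;

import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

// Hypothetical helper class, not part of Avro.
public class AvroConcat {

    /**
     * Appends all data blocks of `source` to the existing Avro file `target`.
     * Pass recompress = true to force the blocks to be re-compressed
     * with the codec of `target`.
     */
    public static void appendFile(File target, File source, boolean recompress)
            throws IOException {
        try (
            // appendTo() reopens an existing file and takes the schema,
            // codec and sync marker from its header.
            DataFileWriter<GenericRecord> writer =
                new DataFileWriter<GenericRecord>(new GenericDatumWriter<GenericRecord>())
                    .appendTo(target);
            // DataFileReader is a DataFileStream, which is the type
            // appendAllFrom() expects.
            DataFileReader<GenericRecord> reader =
                new DataFileReader<GenericRecord>(source, new GenericDatumReader<GenericRecord>())
        ) {
            // Throws an IOException if the two schemas do not match.
            writer.appendAllFrom(reader, recompress);
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java AvroConcat <target.avro> <source.avro>
        appendFile(new File(args[0]), new File(args[1]), false);
    }
}
```

This modifies the target file in place, so work on a copy if the original must be kept.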
If the compression codec is the same, it will just append block by block without re-serialization or re-compression (very fast). You can also force it to re-compress if you wish.

On 4/21/11 4:28 PM, "Andrew Hammond" wrote:

Suppose I have two avro data files containing a number of records. Can I simply concatenate them together to have a single avro data file without losing any records or do I need to actually read them and then write them?
