To my knowledge, LZO is not a supported codec for Avro data files. It's possible that you have an LZO-compressed Hadoop sequence file containing Avro records, but that would be a format you defined yourself, not an Avro data file.
Avro data files are designed to be splittable regardless of the codec they use, so you can have multiple mappers that each consume a portion of the input file. The format achieves that by breaking the data into blocks, and compressing each block separately; hence it can be split at block boundaries.

Best,
Martin

On 22 April 2013 23:47, nir_zamir <[email protected]> wrote:
> Thanks Martin.
>
> What will happen if I try to use an indexed LZO-compressed avro file? Will
> it work and utilize the index to allow multiple mappers?
>
> I think that for Snappy for example, the file is splittable and can use
> multiple mappers, but I haven't tested it yet - would be glad if anyone has
> any experience with that.
>
> Thanks!
> Nir.
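
P.S. In case it helps to see why per-block compression makes a file splittable, here is a toy sketch in Python. It is not the real Avro file format and does not use the Avro library: the layout (fixed 4-byte count and length fields, newline-joined records, zlib as the codec) and the names `write_container` and `read_split` are assumptions made up for illustration. The only idea borrowed from Avro is that each independently compressed block is followed by a 16-byte sync marker, so a reader handed an arbitrary byte range can scan forward to the next marker and start decoding there.

```python
import os
import zlib

SYNC_SIZE = 16  # Avro also uses a 16-byte sync marker; the rest of this layout is a toy


def write_container(records, sync, records_per_block):
    """Build a toy container: a leading marker, then blocks of
    [4-byte record count][4-byte payload length][zlib payload][marker]."""
    out = bytearray(sync)  # stand-in for a file header that ends with the marker
    for i in range(0, len(records), records_per_block):
        chunk = records[i:i + records_per_block]
        # Each block is compressed on its own, so it can be decoded in isolation.
        payload = zlib.compress("\n".join(chunk).encode("utf-8"))
        out += len(chunk).to_bytes(4, "big")
        out += len(payload).to_bytes(4, "big")
        out += payload
        out += sync
    return bytes(out)


def read_split(data, sync, start, end):
    """Return the records of every block whose preceding marker begins in [start, end)."""
    records = []
    marker = data.find(sync, start)  # first marker at or after the split start
    while marker != -1 and marker < end:
        pos = marker + SYNC_SIZE
        if pos >= len(data):
            break  # that was the marker closing the last block
        count = int.from_bytes(data[pos:pos + 4], "big")
        length = int.from_bytes(data[pos + 4:pos + 8], "big")
        payload = data[pos + 8:pos + 8 + length]
        block = zlib.decompress(payload).decode("utf-8").split("\n")
        assert len(block) == count  # sanity check against the block header
        records.extend(block)
        marker = pos + 8 + length  # the marker that closes this block
    return records


# Two "mappers", each given half of the bytes, together see every record once.
records = [f"record-{n}" for n in range(10)]
sync = os.urandom(SYNC_SIZE)
data = write_container(records, sync, records_per_block=3)
mid = len(data) // 2
first_half = read_split(data, sync, 0, mid)
second_half = read_split(data, sync, mid, len(data))
assert first_half + second_half == records
```

The split point does not need to line up with a block boundary. A block belongs to whichever split contains the start of the marker in front of it, so adjacent byte ranges never read the same block twice and never skip one.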
