In general, you're probably better off with AvroKeyValueInputFormat/AvroKeyValueOutputFormat, since they produce Avro data files, which you can read from other applications and other languages. Hadoop sequence files aren't really supported by anything other than Hadoop.
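For reference, here is a minimal sketch of how a job might be wired up to read and write Avro data files with these formats. It assumes the avro-mapred (new `org.apache.avro.mapreduce` API) and Hadoop MapReduce libraries are on the classpath. The class name, job name, and the string/long schemas are made up for illustration, so substitute your own.

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.mapreduce.AvroJob;
import org.apache.avro.mapreduce.AvroKeyValueInputFormat;
import org.apache.avro.mapreduce.AvroKeyValueOutputFormat;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical class name; only the configuration calls matter here.
public class AvroKeyValueJobSketch {

    public static Job configure(Configuration conf, Path in, Path out) throws IOException {
        Job job = Job.getInstance(conf, "avro-key-value-sketch"); // job name is illustrative
        job.setJarByClass(AvroKeyValueJobSketch.class);

        // Assumed schemas: string keys and long values.
        Schema keySchema = Schema.create(Schema.Type.STRING);
        Schema valueSchema = Schema.create(Schema.Type.LONG);

        // Read key/value pairs from Avro data files.
        job.setInputFormatClass(AvroKeyValueInputFormat.class);
        AvroJob.setInputKeySchema(job, keySchema);
        AvroJob.setInputValueSchema(job, valueSchema);

        // Write key/value pairs to Avro data files. Swapping in
        // AvroSequenceFileOutputFormat here would write Hadoop sequence files instead.
        job.setOutputFormatClass(AvroKeyValueOutputFormat.class);
        AvroJob.setOutputKeySchema(job, keySchema);
        AvroJob.setOutputValueSchema(job, valueSchema);

        // Map-only, to keep the sketch to input/output wiring.
        job.setNumReduceTasks(0);

        FileInputFormat.addInputPath(job, in);
        FileOutputFormat.setOutputPath(job, out);
        return job;
    }
}
```

A driver would call `configure(...)` and then `job.waitForCompletion(true)`. A real job would normally also set its own mapper and reducer classes.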
If your data remains entirely within Hadoop, there are cases where you might want to use sequence files. For example, they might be used for the transient files generated during the shuffle (the output of mappers being fed into reducers).

Martin

On 20 May 2014, at 16:34, Jim Donofrio <[email protected]> wrote:

> What are the pro's and con's of
> AvroKeyValueInputFormat/AvroKeyValueOutputFormat vs
> AvroSequenceFileInputFormat/AvroSequenceFileOutputFormat? Which is more
> commonly used?
>
> They both use AvroKey, AvroValue. The only difference seems to be one
> serializes into avro data files and other hadoop sequence files.
>
> Thanks
