Hi Vinod,

In Avro, compression is provided only at the file container level (i.e. block compression).
For compressing a simple byte array, you can rely on Hadoop's compression classes, such as GzipCodec [1], to compress the byte stream directly (wrapping it in a compressed output stream [2] obtained via the codec's helper method [3]). Something like this, for example (I've not tested it out):

ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
GzipCodec codec = ReflectionUtils.newInstance(GzipCodec.class, new Configuration());
OutputStream compressedOutputStream = codec.createOutputStream(outputStream);
[… Encode over compressedOutputStream, etc. …]

[1] - http://hadoop.apache.org/docs/current/api/org/apache/hadoop/io/compress/GzipCodec.html
[2] - http://hadoop.apache.org/docs/current/api/org/apache/hadoop/io/compress/CompressorStream.html
[3] - http://hadoop.apache.org/docs/current/api/org/apache/hadoop/io/compress/GzipCodec.html#createOutputStream(java.io.OutputStream)

On Tue, Apr 9, 2013 at 11:17 AM, Vinod Jammula
<vinod.kumar.jamm...@ericsson.com> wrote:
> Hi,
>
> I have a csv string which I want to serialize, compress and write to a
> database.
>
> I have the following code to serialize the string:
>
> ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
> Encoder e = EncoderFactory.get().binaryEncoder(outputStream, null);
> GenericDatumWriter w = new GenericDatumWriter(schema);
> w.write(record, e);
> byte[] avroBytes = outputStream.toByteArray();
>
> And the following code to de-serialize and process the record:
>
> DatumReader<GenericRecord> reader = new
> GenericDatumReader<GenericRecord>(schema);
> Decoder decoder = DecoderFactory.get().binaryDecoder(avroBytes, null);
> GenericRecord record = reader.read(decoder, null);
>
> I see that compression is available with DataFileWriter and
> DataFileReader, but how do I enable compression for an Avro-serialized
> buffer?
>
> Thanks and Regards,
> Vinod

--
Harsh J
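For completeness, the same compress-then-decompress round trip can also be done with only the JDK's java.util.zip streams, with no Hadoop dependency at all. Below is a minimal, self-contained sketch; the class name GzipRoundTrip and the sample CSV payload are my own placeholders, and in your case you would write the Avro-encoded bytes (your avroBytes array) through the compressing stream instead of the raw sample:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipRoundTrip {

    // Gzip-compress a byte array (e.g. the Avro-encoded bytes).
    // Closing the GZIPOutputStream finishes the gzip trailer.
    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
            gzip.write(raw);
        }
        return bos.toByteArray();
    }

    // Decompress back to the original bytes, which you would then
    // hand to Avro's binaryDecoder for deserialization.
    static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gzip.read(buf)) != -1) {
                bos.write(buf, 0, n);
            }
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the Avro-encoded bytes; the CSV content is illustrative.
        byte[] original = "a,b,c,1,2,3".getBytes("UTF-8");
        byte[] packed = compress(original);
        byte[] unpacked = decompress(packed);
        System.out.println(new String(unpacked, "UTF-8")); // prints a,b,c,1,2,3
    }
}
```

The design is the same as with the Hadoop codec: wrap the destination stream in a compressing stream, write the serialized bytes through it, and make sure the compressing stream is closed (or finished) before reading the buffer, otherwise the gzip trailer is missing and decompression will fail.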