Hi Johan,

On Tue, Mar 13, 2007 at 05:50:21PM +0000, Johan Oskarsson wrote:
>Hi.
>
>I can't seem to find out how to set the compression codec in a
>SequenceFile if it's created when a program runs with the output format
>set to SequenceFileOutputFormat.
>
Use these knobs:

mapred.output.compress (set this to 'true')
io.seqfile.compression.type (NONE, RECORD, BLOCK)
mapred.output.compression.codec (zlib, lzo etc.)

@see SequenceFileOutputFormat.getRecordWriter() for more info...

hth,
Arun

>
>Another question while I'm at it.
>Currently I'm using normal text files for most data. I'd like to switch
>to sequence files or map files. The program in question consists of two
>jobs, first one sums up all the data into userId, resourceId -> counter.
>Second job sorts this by counter and userId and has the resourceId as
>the value. The output format then flips it around so it's still in the
>format: userId, resourceId, counter.
>
>How would one do this sorting in a nice way if the output is a sequence
>file where I would like to keep one object called something like UserRes
>as the key and a IntWritable as the value?
>
>
>/Johan
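
For reference, the three compression knobs named above could be set from job setup code roughly as follows. This is only a sketch, assuming the Hadoop jars are on the classpath: the class name CompressedSeqFileJob and the configure() helper are made up for illustration, BLOCK is an arbitrary choice among the three types, and the codec value is assumed to be a codec class name (DefaultCodec being the zlib-based one).

```java
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileOutputFormat;

// Hypothetical helper class; not part of Hadoop.
public class CompressedSeqFileJob {

    // Applies the three compression properties to an existing job configuration.
    public static JobConf configure(JobConf conf) {
        // Write the job output as SequenceFiles.
        conf.setOutputFormat(SequenceFileOutputFormat.class);

        // Turn on compression of the job output.
        conf.setBoolean("mapred.output.compress", true);

        // One of NONE, RECORD, BLOCK; BLOCK is picked here only as an example.
        conf.set("io.seqfile.compression.type", "BLOCK");

        // Assumption: the property takes a codec class name.
        // DefaultCodec is the zlib-based codec.
        conf.set("mapred.output.compression.codec",
                 "org.apache.hadoop.io.compress.DefaultCodec");

        return conf;
    }
}
```

The same three properties can also be put in the job's configuration file instead of being set in code.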
