Hi Arun,
Arun C Murthy wrote:
> Espen,
>
> On Thu, May 24, 2007 at 03:49:38PM +0200, Espen Amble Kolstad wrote:
>> Hi,
>>
>> I've been trying to use LzoCodec to write a compressed file:
>>
>
> Could you try this command:
> $ bin/hadoop jar build/hadoop-0.12.4-dev-test.jar testsequencefile -seed 0
> -count 10000 -compressType RECORD blah.seq -codec
> org.apache.hadoop.io.compress.LzoCodec -check
This works like it should:
07/05/25 08:29:07 INFO io.SequenceFile: count = 10000
07/05/25 08:29:07 INFO io.SequenceFile: megabytes = 1
07/05/25 08:29:07 INFO io.SequenceFile: factor = 10
07/05/25 08:29:07 INFO io.SequenceFile: create = true
07/05/25 08:29:07 INFO io.SequenceFile: seed = 0
07/05/25 08:29:07 INFO io.SequenceFile: rwonly = false
07/05/25 08:29:07 INFO io.SequenceFile: check = true
07/05/25 08:29:07 INFO io.SequenceFile: fast = false
07/05/25 08:29:07 INFO io.SequenceFile: merge = false
07/05/25 08:29:07 INFO io.SequenceFile: compressType = RECORD
07/05/25 08:29:07 INFO io.SequenceFile: compressionCodec =
org.apache.hadoop.io.compress.LzoCodec
07/05/25 08:29:07 INFO io.SequenceFile: file = blah.seq
07/05/25 08:29:07 INFO util.NativeCodeLoader: Loaded the native-hadoop
library
07/05/25 08:29:07 INFO compress.LzoCodec: Successfully loaded &
initialized native-lzo library
07/05/25 08:29:07 INFO io.SequenceFile: creating 10000 records with
RECORD compression
07/05/25 08:29:13 INFO io.SequenceFile: writing intermediate results to
/tmp/hadoop-espen/mapred/local/intermediate.1
07/05/25 08:29:15 INFO io.SequenceFile: done sorting 10000 debug
07/05/25 08:29:15 INFO io.SequenceFile: sorting 10000 records in memory
for debug
I think the difference is that I write to the stream twice. It seems the
Hadoop code always writes all bytes in a single call.
The code in LzoCompressor checks for userBufLen <= 0 and sets finished =
true; userBufLen is set in setInput(). Doesn't this mean you can only
write to the stream once?!
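If that reading of LzoCompressor is right, a workaround is to buffer everything yourself and hand the compressed stream exactly one write() call. A minimal, self-contained sketch (no Hadoop on the classpath; the commented-out line shows where the real CompressionOutputStream would go):

```java
import java.io.ByteArrayOutputStream;

public class SingleWriteWorkaround {
    public static void main(String[] args) throws Exception {
        // Accumulate all the pieces in memory first, so the compressed
        // stream only ever sees a single write() call.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        buf.write("abc".getBytes());
        buf.write("def".getBytes());
        byte[] all = buf.toByteArray();
        // out.write(all);  // one write to the CompressionOutputStream
        System.out.println(new String(all));
    }
}
```

This sidesteps the finished flag being set after the first write, at the cost of holding the uncompressed data in memory.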
- Espen
>
> LzoCodec seems to work fine for me... maybe your FileOutputStream was somehow
> corrupted?
>
> thanks,
> Arun
>
>> public class LzoTest {
>>
>> public static void main(String[] args) throws Exception {
>> final LzoCodec codec = new LzoCodec();
>> codec.setConf(new Configuration());
>> final CompressionOutputStream out = codec.createOutputStream(new
>> FileOutputStream("test.lzo"));
>> out.write("abc".getBytes());
>> out.write("def".getBytes());
>> out.close();
>> }
>> }
>>
>> I get the following output:
>>
>> 07/05/24 15:44:22 INFO util.NativeCodeLoader: Loaded the native-hadoop
>> library
>> 07/05/24 15:44:22 INFO compress.LzoCodec: Successfully loaded &
>> initialized native-lzo library
>> Exception in thread "main" java.io.IOException: write beyond end of stream
>> at
>> org.apache.hadoop.io.compress.BlockCompressorStream.write(BlockCompressorStream.java:68)
>> at java.io.OutputStream.write(OutputStream.java:58)
>> at no.trank.tI.LzoTest.main(LzoTest.java:19)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> at java.lang.reflect.Method.invoke(Method.java:597)
>> at com.intellij.rt.execution.application.AppMain.main(AppMain.java:90)
>>
>> Isn't it possible to use LzoCodec for this purpose, or is this a bug?
>>
>> - Espen