Espen Amble Kolstad wrote:
Hi,
I changed LzoCompressor.finished() from:
public synchronized boolean finished() {
  // ...
  return (finished && compressedDirectBuf.remaining() == 0);
}
to:
public synchronized boolean finished() {
  // ...
  return (finish && compressedDirectBuf.remaining() == 0);
}
And it seems to work correctly now. I used CompressionCodecFactory.main
to test this. It failed before the change and works after the change.
Both compress and decompress work.
Could you verify, Arun? I'll do some more testing.
Sorry Espen, I've been busy with some 0.13.0 blockers...
It's been a while, but I rechecked that part of the code and it is
correct. The 'finished' variable is set in the native code and needs to
be checked to ensure all data is compressed/decompressed.
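For reference, the lifecycle I have in mind looks roughly like this (just a
sketch, not lifted from TestCodec; the no-arg LzoCompressor constructor and
the usual org.apache.hadoop.io.compress imports are assumed):

  Compressor compressor = new LzoCompressor();
  byte[] input = "abcdef".getBytes();
  byte[] output = new byte[64 * 1024];
  compressor.setInput(input, 0, input.length);   // userBufLen is set here
  compressor.finish();                           // 'finish': no more input will come
  while (!compressor.finished()) {               // 'finished': native side fully drained
    int n = compressor.compress(output, 0, output.length);
    // write the first n bytes of 'output' somewhere
  }
  compressor.end();

The idea being that finished() only reports true once all input handed to the
native code has come back out as compressed bytes.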
Could you take a look at TestCodec.java
(src/test/org/apache/hadoop/io/compress) and see if there is something
you can pick up from there? I'll keep looking at my end.
thanks,
Arun
thanks,
Espen
Espen Amble Kolstad wrote:
Hi Arun,
Arun C Murthy wrote:
Espen,
On Thu, May 24, 2007 at 03:49:38PM +0200, Espen Amble Kolstad wrote:
Hi,
I've been trying to use LzoCodec to write a compressed file:
Could you try this command:
$ bin/hadoop jar build/hadoop-0.12.4-dev-test.jar testsequencefile -seed 0 \
    -count 10000 -compressType RECORD blah.seq \
    -codec org.apache.hadoop.io.compress.LzoCodec -check
This works like it should:
07/05/25 08:29:07 INFO io.SequenceFile: count = 10000
07/05/25 08:29:07 INFO io.SequenceFile: megabytes = 1
07/05/25 08:29:07 INFO io.SequenceFile: factor = 10
07/05/25 08:29:07 INFO io.SequenceFile: create = true
07/05/25 08:29:07 INFO io.SequenceFile: seed = 0
07/05/25 08:29:07 INFO io.SequenceFile: rwonly = false
07/05/25 08:29:07 INFO io.SequenceFile: check = true
07/05/25 08:29:07 INFO io.SequenceFile: fast = false
07/05/25 08:29:07 INFO io.SequenceFile: merge = false
07/05/25 08:29:07 INFO io.SequenceFile: compressType = RECORD
07/05/25 08:29:07 INFO io.SequenceFile: compressionCodec = org.apache.hadoop.io.compress.LzoCodec
07/05/25 08:29:07 INFO io.SequenceFile: file = blah.seq
07/05/25 08:29:07 INFO util.NativeCodeLoader: Loaded the native-hadoop library
07/05/25 08:29:07 INFO compress.LzoCodec: Successfully loaded & initialized native-lzo library
07/05/25 08:29:07 INFO io.SequenceFile: creating 10000 records with RECORD compression
07/05/25 08:29:13 INFO io.SequenceFile: writing intermediate results to /tmp/hadoop-espen/mapred/local/intermediate.1
07/05/25 08:29:15 INFO io.SequenceFile: done sorting 10000 debug
07/05/25 08:29:15 INFO io.SequenceFile: sorting 10000 records in memory for debug
I think the difference is that I try to write to the stream twice. It
seems the Hadoop code always writes all bytes at once.
The code in LzoCompressor checks for userBufLen <= 0 and sets finished =
true; userBufLen is set in setInput(). This means you can only write to
the stream once?!
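In other words, the only pattern that seems to work is buffering everything
and doing a single write, roughly like this (a sketch only, the file name is
made up; whether this is the intended usage is exactly my question):

  LzoCodec codec = new LzoCodec();
  codec.setConf(new Configuration());
  CompressionOutputStream out =
      codec.createOutputStream(new FileOutputStream("single-write.lzo"));
  byte[] all = "abcdef".getBytes();
  out.write(all, 0, all.length);  // one write of all the bytes succeeds
  out.close();                    // a second write() before close() fails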
- Espen
LzoCodec seems to work fine for me... maybe your FileOutputStream was somehow
corrupted?
thanks,
Arun
import java.io.FileOutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionOutputStream;
import org.apache.hadoop.io.compress.LzoCodec;

public class LzoTest {
  public static void main(String[] args) throws Exception {
    final LzoCodec codec = new LzoCodec();
    codec.setConf(new Configuration());
    final CompressionOutputStream out =
        codec.createOutputStream(new FileOutputStream("test.lzo"));
    // Two separate writes to the compressed stream; the second one fails.
    out.write("abc".getBytes());
    out.write("def".getBytes());
    out.close();
  }
}
I get the following output:
07/05/24 15:44:22 INFO util.NativeCodeLoader: Loaded the native-hadoop library
07/05/24 15:44:22 INFO compress.LzoCodec: Successfully loaded & initialized native-lzo library
Exception in thread "main" java.io.IOException: write beyond end of stream
        at org.apache.hadoop.io.compress.BlockCompressorStream.write(BlockCompressorStream.java:68)
        at java.io.OutputStream.write(OutputStream.java:58)
        at no.trank.tI.LzoTest.main(LzoTest.java:19)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at com.intellij.rt.execution.application.AppMain.main(AppMain.java:90)
Isn't it possible to use LzoCodec for this purpose, or is this a bug?
- Espen