If you are getting a SIGSEGV, it never hurts to try a more recent JVM. 21 has many bug fixes at this point.
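When triaging a native-code crash like this, it helps to record exactly which JVM and native library path are in play. A minimal, Hadoop-free sketch using only standard system properties (class name is illustrative):

```java
public class JvmInfo {
    public static void main(String[] args) {
        // Standard JVM system properties; useful context for a SIGSEGV report.
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("java.vm.name = " + System.getProperty("java.vm.name"));
        System.out.println("java.library.path = " + System.getProperty("java.library.path"));
    }
}
```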
On Tue, May 22, 2012 at 11:45 AM, Jason B <urg...@gmail.com> wrote:
> JIRA entry created:
>
> https://issues.apache.org/jira/browse/HADOOP-8423
>
>
> On 5/21/12, Jason B <urg...@gmail.com> wrote:
>> Sorry about using an attachment. The code is below for reference.
>> (I will also file a jira as you suggested.)
>>
>> package codectest;
>>
>> import com.hadoop.compression.lzo.LzoCodec;
>> import java.io.IOException;
>> import java.util.Formatter;
>> import org.apache.hadoop.conf.Configuration;
>> import org.apache.hadoop.fs.FileSystem;
>> import org.apache.hadoop.io.MapFile;
>> import org.apache.hadoop.io.SequenceFile.CompressionType;
>> import org.apache.hadoop.io.Text;
>> import org.apache.hadoop.io.compress.CompressionCodec;
>> import org.apache.hadoop.io.compress.DefaultCodec;
>> import org.apache.hadoop.io.compress.SnappyCodec;
>> import org.apache.hadoop.util.Tool;
>> import org.apache.hadoop.util.ToolRunner;
>>
>> public class MapFileCodecTest implements Tool {
>>     private Configuration conf = new Configuration();
>>
>>     private void createMapFile(Configuration conf, FileSystem fs, String path,
>>             CompressionCodec codec, CompressionType type, int records) throws IOException {
>>         MapFile.Writer writer = new MapFile.Writer(conf, fs, path,
>>                 Text.class, Text.class, type, codec, null);
>>         Text key = new Text();
>>         for (int j = 0; j < records; j++) {
>>             StringBuilder sb = new StringBuilder();
>>             Formatter formatter = new Formatter(sb);
>>             formatter.format("%03d", j);
>>             key.set(sb.toString());
>>             writer.append(key, key);
>>         }
>>         writer.close();
>>     }
>>
>>     private void testCodec(Configuration conf, Class<? extends CompressionCodec> clazz,
>>             CompressionType type, int records) throws IOException {
>>         FileSystem fs = FileSystem.getLocal(conf);
>>         try {
>>             System.out.println("Creating MapFiles with " + records +
>>                     " records using codec " + clazz.getSimpleName());
>>             String path = clazz.getSimpleName() + records;
>>             createMapFile(conf, fs, path, clazz.newInstance(), type, records);
>>             MapFile.Reader reader = new MapFile.Reader(fs, path, conf);
>>             Text key1 = new Text("002");
>>             if (reader.get(key1, new Text()) != null) {
>>                 System.out.println("1st key found");
>>             }
>>             Text key2 = new Text("004");
>>             if (reader.get(key2, new Text()) != null) {
>>                 System.out.println("2nd key found");
>>             }
>>         } catch (Throwable ex) {
>>             ex.printStackTrace();
>>         }
>>     }
>>
>>     @Override
>>     public int run(String[] strings) throws Exception {
>>         System.out.println("Using native library " +
>>                 System.getProperty("java.library.path"));
>>
>>         testCodec(conf, DefaultCodec.class, CompressionType.RECORD, 100);
>>         testCodec(conf, SnappyCodec.class, CompressionType.RECORD, 100);
>>         testCodec(conf, LzoCodec.class, CompressionType.RECORD, 100);
>>
>>         testCodec(conf, DefaultCodec.class, CompressionType.RECORD, 10);
>>         testCodec(conf, SnappyCodec.class, CompressionType.RECORD, 10);
>>         testCodec(conf, LzoCodec.class, CompressionType.RECORD, 10);
>>
>>         testCodec(conf, DefaultCodec.class, CompressionType.BLOCK, 100);
>>         testCodec(conf, SnappyCodec.class, CompressionType.BLOCK, 100);
>>         testCodec(conf, LzoCodec.class, CompressionType.BLOCK, 100);
>>
>>         testCodec(conf, DefaultCodec.class, CompressionType.BLOCK, 10);
>>         testCodec(conf, SnappyCodec.class, CompressionType.BLOCK, 10);
>>         testCodec(conf, LzoCodec.class, CompressionType.BLOCK, 10);
>>         return 0;
>>     }
>>
>>     @Override
>>     public void setConf(Configuration c) {
>>         this.conf = c;
>>     }
>>
>>     @Override
>>     public Configuration getConf() {
>>         return conf;
>>     }
>>
>>     public static void main(String[] args) throws Exception {
>>         ToolRunner.run(new MapFileCodecTest(), args);
>>     }
>>
>> }
>>
>>
>> On 5/21/12, Todd Lipcon <t...@cloudera.com> wrote:
>>> Hi Jason,
>>>
>>> Sounds like a bug. Unfortunately the mailing list strips attachments.
>>>
>>> Can you file a jira in the HADOOP project, and attach your test case there?
>>>
>>> Thanks
>>> Todd
>>>
>>> On Mon, May 21, 2012 at 3:57 PM, Jason B <urg...@gmail.com> wrote:
>>>> I am using the Cloudera distribution cdh3u1.
>>>>
>>>> While checking native codecs such as Snappy or LZO for better
>>>> decompression performance, I ran into issues with random access
>>>> using the MapFile.Reader.get(key, value) method. The first call of
>>>> MapFile.Reader.get() works, but a second call fails.
>>>>
>>>> I am also getting different exceptions depending on the number of
>>>> entries in the map file. With LzoCodec and a 10-record file, the
>>>> JVM aborts.
>>>>
>>>> At the same time, the DefaultCodec works fine in all cases, as does
>>>> record compression for the native codecs.
>>>>
>>>> I created a simple test program (attached) that creates map files
>>>> locally with sizes of 10 and 100 records for three codecs: Default,
>>>> Snappy, and LZO.
>>>> (The test requires the corresponding native libraries to be available.)
>>>>
>>>> A summary of the problems is given below:
>>>>
>>>> Map Size: 100
>>>> Compression: RECORD
>>>> ==================
>>>> DefaultCodec: OK
>>>> SnappyCodec: OK
>>>> LzoCodec: OK
>>>>
>>>> Map Size: 10
>>>> Compression: RECORD
>>>> ==================
>>>> DefaultCodec: OK
>>>> SnappyCodec: OK
>>>> LzoCodec: OK
>>>>
>>>> Map Size: 100
>>>> Compression: BLOCK
>>>> ==================
>>>> DefaultCodec: OK
>>>>
>>>> SnappyCodec: java.io.EOFException at
>>>> org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:114)
>>>>
>>>> LzoCodec: java.io.EOFException at
>>>> org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:114)
>>>>
>>>> Map Size: 10
>>>> Compression: BLOCK
>>>> ==================
>>>> DefaultCodec: OK
>>>>
>>>> SnappyCodec: java.lang.NoClassDefFoundError: Ljava/lang/InternalError
>>>> at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.decompressBytesDirect(Native Method)
>>>>
>>>> LzoCodec:
>>>> #
>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>> #
>>>> # SIGSEGV (0xb) at pc=0x00002b068ffcbc00, pid=6385, tid=47304763508496
>>>> #
>>>> # JRE version: 6.0_21-b07
>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (17.0-b17 mixed mode linux-amd64)
>>>> # Problematic frame:
>>>> # C  [liblzo2.so.2+0x13c00]  lzo1x_decompress+0x1a0
>>>> #
>>>> # An error report file with more information is saved as:
>>>> # /hadoop/user/yurgis/testapp/hs_err_pid6385.log
>>>> #
>>>> # If you would like to submit a bug report, please visit:
>>>> # http://java.sun.com/webapps/bugreport/crash.jsp
>>>> # The crash happened outside the Java Virtual Machine in native code.
>>>> # See problematic frame for where to report the bug.
>>>> #
>>>
>>>
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
>>
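One plausible reading of "first get() works, second fails" is stateful decompressor reuse: a block decompressor that is reused after a seek must be reset first, or it still holds state from the previous read. This is only an illustration of that general pattern using the JDK's own zlib classes, not the actual Hadoop code path (which is what HADOOP-8423 is tracking):

```java
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class InflaterReuse {
    public static void main(String[] args) throws Exception {
        byte[] input = "hello hello hello".getBytes("UTF-8");

        // Compress once.
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] compressed = new byte[256];
        int clen = deflater.deflate(compressed);
        deflater.end();

        Inflater inflater = new Inflater();
        byte[] out = new byte[256];

        // First decompression works.
        inflater.setInput(compressed, 0, clen);
        int n1 = inflater.inflate(out);

        // Reusing the same Inflater for a second, independent read
        // (analogous to a second random-access get()): without reset()
        // it still carries state from the first stream and returns nothing.
        inflater.reset();
        inflater.setInput(compressed, 0, clen);
        int n2 = inflater.inflate(out);
        inflater.end();

        System.out.println(n1 == input.length && n2 == input.length);
    }
}
```

Drop the reset() call and the second inflate() yields no output, which is the same shape of failure as the second get() above; whether that is the actual root cause here is for the JIRA to determine.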