In the documentation it says LZO compression is not used by default, but in fact, if
the cluster is configured with LZO, Kylin will use it. So I reviewed the source
code and found that Kylin decides whether to use LZO for the HBase table based on
a compression test result, not on the user's configuration.
Also, we would prefer Snappy as the default compression; will Kylin support it?
//////////////////////////////////////////////////
This is the source code in CreateHTableJob.java:
for (HBaseColumnFamilyDesc cfDesc : cubeDesc.getHBaseMapping().getColumnFamily()) {
    HColumnDescriptor cf = new HColumnDescriptor(cfDesc.getName());
    cf.setMaxVersions(1);

    if (LZOSupportnessChecker.getSupportness()) {
        logger.info("hbase will use lzo to compress data");
        cf.setCompressionType(Algorithm.LZO);
    } else {
        logger.info("hbase will not use lzo to compress data");
    }

    cf.setDataBlockEncoding(DataBlockEncoding.FAST_DIFF);
    cf.setInMemory(false);
    cf.setBlocksize(4 * 1024 * 1024); // set to 4MB
    tableDesc.addFamily(cf);
}
public class LZOSupportnessChecker {
    private static final Logger log = LoggerFactory.getLogger(LZOSupportnessChecker.class);

    public static boolean getSupportness() {
        try {
            File temp = File.createTempFile("test", ".tmp");
            CompressionTest.main(new String[] { "file://" + temp.getAbsolutePath(), "lzo" });
        } catch (Exception e) {
            log.error("Fail to compress file with lzo", e);
            return false;
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("LZO supported by current env? " + getSupportness());
    }
}
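//////////////////////////////////////////////////
For reference, here is a minimal sketch of what a configuration-driven choice could look like. The property name kylin.hbase.default.compression.codec and the helper class/method below are hypothetical, only to illustrate reading the codec (e.g. Snappy) from Kylin's configuration instead of probing for LZO support:

import java.util.Properties;

import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.io.compress.Compression.Algorithm;

public class CompressionConfigSketch {

    // Hypothetical property name; Kylin does not define this key today.
    private static final String COMPRESSION_PROP = "kylin.hbase.default.compression.codec";

    /**
     * Picks the column-family compression from configuration instead of the
     * LZO support probe. Falls back to NONE when nothing is configured or the
     * configured codec name is unknown.
     */
    public static void applyCompression(HColumnDescriptor cf, Properties kylinConfig) {
        String codec = kylinConfig.getProperty(COMPRESSION_PROP, "none");
        try {
            cf.setCompressionType(Algorithm.valueOf(codec.trim().toUpperCase()));
        } catch (IllegalArgumentException e) {
            // Unknown codec name: leave the family uncompressed rather than fail the job.
            cf.setCompressionType(Algorithm.NONE);
        }
    }

    public static void main(String[] args) {
        HColumnDescriptor cf = new HColumnDescriptor("F1");
        Properties props = new Properties();
        props.setProperty(COMPRESSION_PROP, "snappy"); // e.g. prefer Snappy cluster-wide
        applyCompression(cf, props);
        System.out.println("Configured compression: " + cf.getCompressionType());
    }
}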