In the documentation it says LZO compression is not used by default, but in fact, if
the cluster is configured with LZO, Kylin will use it. So I reviewed the source
code and found that Kylin decides whether to use LZO for the HBase table based on
a compression test result, not on the user's configuration.
Also, we would prefer Snappy as the default compression; will Kylin support it?
//////////////////////////////////////////////////
This is the source code in CreateHTableJob.java:
for (HBaseColumnFamilyDesc cfDesc : cubeDesc.getHBaseMapping().getColumnFamily()) {
    HColumnDescriptor cf = new HColumnDescriptor(cfDesc.getName());
    cf.setMaxVersions(1);

    if (LZOSupportnessChecker.getSupportness()) {
        logger.info("hbase will use lzo to compress data");
        cf.setCompressionType(Algorithm.LZO);
    } else {
        logger.info("hbase will not use lzo to compress data");
    }

    cf.setDataBlockEncoding(DataBlockEncoding.FAST_DIFF);
    cf.setInMemory(false);
    cf.setBlocksize(4 * 1024 * 1024); // set to 4MB
    tableDesc.addFamily(cf);
}
public class LZOSupportnessChecker {
    private static final Logger log = LoggerFactory.getLogger(LZOSupportnessChecker.class);

    public static boolean getSupportness() {
        try {
            File temp = File.createTempFile("test", ".tmp");
            CompressionTest.main(new String[] { "file://" + temp.getAbsolutePath(), "lzo" });
        } catch (Exception e) {
            log.error("Fail to compress file with lzo", e);
            return false;
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("LZO supported by current env? " + getSupportness());
    }
}
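//////////////////////////////////////////////////
For reference, here is a minimal sketch of what a configuration-driven choice could look like. The property name kylin.hbase.default.compression.codec and the helper class/method below are hypothetical, only to illustrate reading the codec (e.g. Snappy) from Kylin's configuration instead of probing for LZO support:

import java.util.Properties;

import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.io.compress.Compression.Algorithm;

public class CompressionConfigSketch {

    // Hypothetical property name; Kylin does not define this key today.
    private static final String COMPRESSION_PROP = "kylin.hbase.default.compression.codec";

    /**
     * Picks the column-family compression from configuration instead of the
     * LZO support probe. Falls back to NONE when nothing is configured or the
     * configured codec name is unknown.
     */
    public static void applyCompression(HColumnDescriptor cf, Properties kylinConfig) {
        String codec = kylinConfig.getProperty(COMPRESSION_PROP, "none");
        try {
            cf.setCompressionType(Algorithm.valueOf(codec.trim().toUpperCase()));
        } catch (IllegalArgumentException e) {
            // Unknown codec name: leave the family uncompressed rather than fail the job.
            cf.setCompressionType(Algorithm.NONE);
        }
    }

    public static void main(String[] args) {
        HColumnDescriptor cf = new HColumnDescriptor("F1");
        Properties props = new Properties();
        props.setProperty(COMPRESSION_PROP, "snappy"); // e.g. prefer Snappy cluster-wide
        applyCompression(cf, props);
        System.out.println("Configured compression: " + cf.getCompressionType());
    }
}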